Preface



In this book, we present a collection of papers around the topic of Agent- 
Mediated Knowledge Management. Most of the papers are extended and im- 
proved versions of work presented at the symposium on Agent-Mediated Knowl- 
edge Management held during the AAAI Spring Symposia Series in March 2003 
at Stanford University. 

The aim of the Agent-Mediated Knowledge Management symposium was to
bring together researchers and practitioners of the fields of KM and agent
technologies to discuss the benefits, possibilities and added value of
cross-fertilization.

Knowledge Management (KM) has been a predominant trend in business
in recent years. Not only is Knowledge Management an important field of
application for AI and related techniques, such as CBR technology for intelligent
lessons-learned systems, it also provides new challenges to the AI community,
like, for example, context-aware knowledge delivery. Scaling up research proto- 
types to real-world solutions usually requires an application-driven integration of 
several basic technologies, e.g., ontologies for knowledge sharing and reuse, col- 
laboration support like CSCW systems, and personalized information services. 
Typical characteristics to be dealt with in such an integration are: 

- manifold, logically and physically dispersed actors and knowledge sources,

- different degrees of formalization of knowledge,

- different kinds of (Web-based) services and (legacy) systems,

- conflicts between local (individual) and global (group or organizational) goals.

Agent approaches have already been successfully employed in KM for 
many partial solutions within the overall picture: agent-based workflow, cooper- 
ative information gathering, intelligent information integration, and personal in- 
formation agents are established techniques in this area. In order to cope with the 
inherent complexity of a more comprehensive solution, Agent-Mediated Knowl- 
edge Management (AMKM) deals with collective aspects of the domain in an 
attempt to cope with the conflict between the desired order and the actual be- 
havior in dynamic environments. AMKM introduces a social layer that structures 
the society of agents by defining specific roles and possible interactions between 
them. 

This workshop set the scene for the assessment of the challenges that Agent- 
Mediated Knowledge Management faces as well as the opportunities it creates. 
By focusing on agent-mediated interactions, specialists from different disciplines
were brought together in a lively and inquisitive environment that fostered
productive interactions and debates. The main topics for the workshop were:

- collaboration and P2P support,

- agent-based community support,

- agent models for knowledge and organizations,

- context and personalization,

- ontologies and the Semantic Web,

- agents and knowledge engineering.

Besides extended versions of workshop presentations, this volume includes 
an introductory chapter, and papers originating from the invited talk and from 
discussion sessions at the symposium. The result is that this volume contains 
high-quality papers that really can be called representative of the field at this 
moment. 

This volume starts with an introduction to the Agent-Mediated Knowledge 
Management topic. The paper provides an extended motivation and an overview 
of research and current developments in the field. The remainder of the volume 
has been arranged according to the topics listed above. 

The first section contains four papers on collaboration and peer-to-peer
support. The first paper in this section, by Bonifacio et al., proposes a P2P
architecture for distributed KM. Graesser et al. discuss the results of a study on
the benefits for KM from intelligent interfaces, namely animated conversational
agents. The third paper, by Guizzardi et al., presents Help&Learn, an agent-based
peer-to-peer helpdesk system to support extra-class interactions among students
and teachers. The section ends with a paper by Ehrig et al. suggesting a concise
framework for evaluation of P2P-based Knowledge Management systems.

The second section contains three papers on agent-based community sup- 
port. The first paper by Schulz et al. presents a conceptual framework for trust- 
based agent-mediated knowledge exchange in mobile communities. Kayama and 
Okamoto examine knowledge management and representation issues for the sup- 
port of collaborative learning. The last paper in this section, by Moreale and 
Watt, describes a mailing list tool, based on the concept of a mailing list assis- 
tant. 

The third section is devoted to agent models for knowledge and organizations. 
Filipe discusses the coordination and representation of social structures based 
on using the EDA agent model for normative agents, combined with the notion 
of an information field. Lawless looks at the fundamental relations between the 
generation of information and knowledge, with agent organizations, decision- 
making, trust, cooperation, and competition. The third paper, by Furtado and 
Machado, describes an AMKM system for knowledge discovery in databases. 
Hui et al. report on experience using RDF to provide a rich content language for
use with FIPA agent toolkits. The paper by Magalhães and Lucena, describing
a multiagent architecture for tool generation for document classification, closes
this section.

The fourth section, on context and personalization, starts with a paper by
Louçã, who presents a multiagent model to support decision-making in
organizations. Novak et al. introduce an agent-based approach to semantic
exploration and knowledge discovery in large information spaces. The paper by
Evans et al. looks at the use of agents to identify and filter relevant context
information in information domains. The section ends with a paper by Blanzieri
et al., presenting the concept of implicit culture for personal agents in KM.

The fifth section contains six papers that focus on ontologies and the Semantic
Web. Cao and Gandon discuss the benefits of societies of agents in a corporate
semantic web. Krueger et al. look at ways to fully realize the potential of the 
Semantic Web, by automatically upgrading information sources with semantic 
markup. Hassan investigates interfaces to harness knowledge from heterogeneous 
knowledge assets. Cassin et al. present an architecture for extracting structured 
information from raw Web pages and describe techniques for extracting onto- 
logical meaning from structured information. The paper by Toivonen and Helin 
presents a DAML ontology for describing interaction protocols. The last paper 
in this section, by Petrie et al., discusses the benefits of agent technology to the 
development of Web services. 

The last section of the book contains six papers related to agent and knowl- 
edge engineering. The first paper, by Furtado et al., studies the relationship be- 
tween agent technology, knowledge discovery in databases, and knowledge man- 
agement. The paper by Molani et al. describes an approach to capture strategic 
dependencies in organizational settings in order to support the elicitation of re- 
quirements for KM systems. Bailin and Truszkowski discuss the role of perspec- 
tive in conflicts in agent communities. The paper by Tacla and Barthes concerns 
a multiagent system for knowledge management in R&D projects. Pease and Li 
introduce a system for collaborative open ontology production. Finally, the pa- 
per by Dodero et al. describes an agent-based architecture to support knowledge 
production and sharing. 

We want to conclude this preface by extending our thanks to the members
of the program committee of the AMKM workshop and to the additional
reviewers who carefully read all submissions and provided extensive feedback
on them. We also want to thank all authors who were not only willing
to submit their papers to our workshop and rework them for this book, but in
addition contributed by their lively participation in a spontaneously organized
peer review process.

Abstract. In this paper, we outline the relation between Knowledge Management
(KM) as an application area on the one hand, and software agents as a basic
technology for supporting KM on the other. We start by presenting characteris- 
tics of KM which account for some drawbacks of today’s - typically centralized 
- technological approaches for KM. We argue that the basic features of agents 
(social ability, autonomy, re- and proactiveness) can alleviate several of these 
drawbacks. A classification schema for the description of agent-based KM sys- 
tems is established, and a couple of example systems are depicted in terms of this 
schema. The paper concludes with questions which we think research in Agent- 
mediated Knowledge Management (AMKM) should deal with. 



1 Agents and KM 

Knowledge Management (KM) is defined as a systematic, holistic approach for sustain- 
ably improving the handling of knowledge on all levels of an organization (individual, 
group, organizational, and inter-organizational level) in order to support the organiza- 
tion’s business goals, such as innovation, quality, cost effectiveness etc. (cp. [33]). 

KM is primarily a management discipline combining methods from human
resource management, strategic planning, change management, and organizational
behavior. However, the role of information technology as an enabling factor is also
widely recognized, and - after a first phase where merely general purpose technology
like Internet/Intranets or e-mail¹ was found to be useful for facilitating KM - a
variety of proposals exist showing how to support KM with specialized information
systems (see, e.g., [4]).

One class of such systems assumes that a huge amount of organizational knowledge
is explicitly formalized (or “buried”) in documents, and therefore tries to “connect”
knowledge workers with useful information items. Typical systems in this category are
Organizational Memory Information Systems (OMs, cp. [1, 24]) which acquire and
structure explicit knowledge and aim at high-precision information delivery services
(“provide the right people with the right information at the right time”).

¹ Especially large companies often report that these technologies were the first ways
to communicate and distribute knowledge across boundaries of hierarchies.

On the other hand, expert finder systems or community of practice support don’t 
rely so much on explicitly represented knowledge, but rather bring people together, 
for instance, to solve a given knowledge-intensive problem (see, for instance [7, 28]). 
Although such systems also use some explicit knowledge, with respect to the actual 
knowledge-intensive task, this is more meta than problem-solving knowledge. 

Often, Information Technology (IT) research for KM focused on the comprehensive 
use of an organization’s knowledge, thus aiming at the completeness of distribution of 
relevant information. Technically, this is typically supported by centralized approaches: 
Knowledge about people, knowledge about processes, and domain knowledge is repre- 
sented and maintained as information in global repositories which serve as sources to 
meet a knowledge worker’s (potentially complex) information needs. Such
repositories may be structured by global ontologies and made accessible, e.g., through
knowledge portals [75, 52]. Or they may be rather “flat” and accessed via shallow (i.e.,
not knowledge-based) methods like statistics-based information retrieval or
collaborative filtering (this is the typical approach of today’s commercial KM tools).

In the following, we present some KM characteristics which - in our opinion -
account for serious drawbacks of such centralized IT approaches to KM, and which can
be translated directly into requirements for a powerful KM system design:

R1 KM has to respect the distributed nature of knowledge in organizations: The di- 
vision of labor in modern companies leads to a distribution of expertise, problem 
solving capabilities, and responsibilities. While specialization is certainly a main 
factor for the productivity of today’s companies, its consequence is that both gen- 
eration and use of knowledge are not evenly spread within the organization. This 
leads to high demands on KM: 

- Departments, groups, and individual experts develop their particular views on
given subjects. These views are motivated and justified by the particularities of
the actual work, goals, and situation. Obtaining a single, globally agreed-upon
vocabulary (or ontology) at a level of detail which is sufficient for all
participants may incur high costs (e.g., for negotiation). A KM system should
therefore allow a balance between (a) global knowledge which might have or
might constitute a shared context, but may also be relatively expensive; and (b)
local expertise which might represent knowledge that is not easily shareable or
is not worth sharing.

- As global views cannot always be reached, a KM system has to be able to
handle context switches of knowledge assets, e.g., by providing explicit
procedures for capturing the context during knowledge acquisition and for
recontextualizing during knowledge support. An example of context capturing is
a lessons-learned system which is fed by debriefings after a project is finished
[43, 42]. Here, a typical question pair is: “What was the most crucial point of
the project’s success? What are the characteristics of projects where this point
may also occur?”

Altogether, we see that distributedness of knowledge in an organizational memory
is not a “bug”, but rather a “feature”, which is by far not only a matter of the
physical or technical location of some file. It also has manifold logical and
content-oriented aspects that in turn lead to derived aspects such as (in an ideal
system) the need to deal with matters of

- trust (Do I believe in my neighbor's knowledge?), 

- responsibility (Is my neighbor obliged to maintain his knowledge base because 
I might use it? And am I obliged to point out errors that I find in his knowledge 
base?), 

- acknowledgement (Who gets the reward if I succeed with my neighbor’s know- 
ledge?), 

- contextuality of knowledge (Is my neighbor’s knowledge still valid and appli- 
cable in my house and my family?), 

- ... and many others. 

R2 There is an inherent goal dichotomy between business processes and KM processes:
For companies as a whole as well as for the individual knowledge worker, KM
processes do not directly serve the operational business goals, but are second-order
processes². Within an environment of bounded resources, knowledge workers will
always concentrate on their first-order business processes. This means they
optimize their operational goals locally and only invest very little to fulfil strategic,
global KM goals.³ It is clear and pretty well accepted that having and using
knowledge is important for optimally fulfilling first-order tasks, but the workload and
time pressure are nevertheless usually so high that the effort invested in preparing
this (time for knowledge conservation, evolution, organization, etc.) is considered
a second-order process and is often neglected in practice. Even cumbersome activities
for knowledge search and reuse are often considered to be unacceptable. Therefore,
the KM processes should be embedded in the worker’s first-order processes, and
proactive tools should minimize the cognitive load for KM tasks.

R3 Knowledge work as well as KM in general, is “wicked problem solving” (cf. [15, 
21, 22]): This means that a precise a-priori description of how to execute a task 
or solve the problem doesn’t exist, and consequently, it cannot be said in advance 
when knowledge should be captured, distributed, or used optimally. An optimal so- 
lution for KM problems and the respective knowledge and information flows cannot 
be prescribed entirely from start to finish, because goals may change or be adapted 
with each step of working on a task. Therefore knowledge workers and KM sys- 
tems must be flexible enough to adapt to additional insights and to proactively take 
opportunities when they arise during work. Solving “wicked problems” is typically
a fundamentally social process. A KM system should therefore support the
necessary complex interactions and underlying, relatively sophisticated processes like
planning, coordination and negotiation of knowledge activities.

A phenomenon closely related to this is that KM is very much about personal
relationships. People want to be recognized as experts, and they are much more willing
to share knowledge face-to-face in collaborative problem-solving and expert chats
than to put it anonymously into a central knowledge store. Hence flexible
point-to-point connections for powerful online communication and collaboration, as well
as individual solutions for knowledge storage, identification, and communication
must be allowed.

² There are a couple of exceptions to this, like R&D departments which have knowledge
generation as first-order goal. For a discussion of operational processes vs. knowledge
processes, see, for instance, [68, 78].

³ In other words, employees will mostly find a way to get their business done, even if
processes and tool support are bad, whereas KM tasks will simply be omitted. This has
been our experience in KM systems building from our very first requirements gathering
on [47].

R4 KM has to deal with changing environments: In addition to the intrinsic problems
described above, KM systems typically reside in environments which are subject to
frequent changes, be it in the organizational structure, in business processes, or in
IT infrastructure. Centralized solutions are often ill-suited to deal with continuous
modifications in the enterprise, e.g., because the maintenance costs for detailed
models and ontologies simply get too high.

Furthermore, the implementation of KM systems often follows a more evolutionary
approach where functionalities are not implemented “in one step” for a whole
company, but partial solutions are deployed to clearly separated sub-structures. In
order to obtain a comprehensive system, these elements then have to be integrated
under a common ceiling without disturbing their individual value.⁴

Keeping these requirements in mind, let’s have a look at scenarios which are con- 
sidered to be rewarding tasks for agent-based software solutions. We quote a number 
of characteristics from [60] (but similar arguments can be found in many books about 
multi-agent systems) typically indicating that a scenario could be a good application 
area for agent technology: agents are best suited to applications that are modular, de- 
centralized, changeable, ill-structured, and complex. 

Although the match between these five salient features and the KM requirements 
R1 - R4 listed above is already obvious, we want to elaborate a bit more explicitly on 
this match. Let us start with the weak definition of agents [83] (with the definitional 
features autonomy, social ability, reactive behavior, and proactive behavior). Now we 
will see why agent-based approaches are especially well-suited to support KM with 
information technology: 

In the first place, the notion of agents can be seen as a natural metaphor to model 
KM environments which can be conceived as consisting of a number of interacting en- 
tities (individuals, groups, IT, etc.) that constitute a potentially complex organizational 
structure (see R1, but also R4). Reflecting this in an agent-based architecture may help
to maintain integrity of the existing organizational structure and the autonomy of its sub- 
parts. Autonomy and social ability of the single agents are the basic means to achieve 
this. 

Reactivity and proactivity of agents help to cope with the flexibility needed to deal 
with the “wicked” nature of KM tasks (see R3). The resulting complex interactions 
with the related actors in the KM landscape and the environment can be supported and 
modeled by the complex social skills with which agents can be endowed. 

Proactiveness as well as autonomy help to accommodate the reality that knowledge
workers typically do not adopt KM goals with a high priority (see R2).

⁴ This requirement of connecting several smaller existing KM islands to create a bigger
picture also fits very well with the frequently suggested KM introduction strategy of
looking for “quick wins” (cp. [81]).

Regarding primarily the software-technology aspects of agents, they represent a way
of incorporating legacy systems into modern distributed information systems;
wrapping a legacy system with an agent will enable the legacy system to interact with
other systems much more easily. Furthermore, agent approaches allow for extensibility
and openness in situations when it is impossible to know at design time exactly which
components and uses the system will have. Both arguments reflect pretty well the
technical consequences of abstract requirements such as R4 and R3 (changing
environments demand continuous reconfiguration, and the unpredictable nature of
wicked-problem solving requires flexible approaches), R2 (competition between
operational work and KM meta work calls for stepwise deployment and highly integrated
KM solutions), or R1 (already existing local solutions must be confederated).
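
The wrapping idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the legacy store, its `FETCH_REC` call, the `WrapperAgent` class, and the dictionary-based message format with its performative labels are hypothetical and do not follow any particular agent platform or standard.

```python
class LegacyDocumentStore:
    """Hypothetical stand-in for a legacy system with an idiosyncratic interface."""

    def __init__(self):
        self._rows = {"D-17": "Lessons learned: project kickoff checklist"}

    def FETCH_REC(self, key):
        # Legacy-style call: returns an empty string when the key is unknown.
        return self._rows.get(key, "")


class WrapperAgent:
    """Exposes the legacy store through a uniform message interface.

    Messages are plain dictionaries in this sketch; the performative
    labels are illustrative strings, not a conformant protocol.
    """

    def __init__(self, name, legacy):
        self.name = name
        self._legacy = legacy

    def handle(self, message):
        # Anything other than a query is rejected as not understood.
        if message.get("performative") != "query":
            return {"performative": "not-understood", "sender": self.name}
        # Translate the message into the legacy call and back again.
        content = self._legacy.FETCH_REC(message.get("content", ""))
        if not content:
            return {"performative": "failure", "sender": self.name}
        return {"performative": "inform", "sender": self.name, "content": content}


agent = WrapperAgent("store-wrapper", LegacyDocumentStore())
reply = agent.handle({"performative": "query", "content": "D-17"})
```

Other agents only need to know the message format; the legacy call stays hidden behind the wrapper, so the store can later be replaced without changing its clients.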

There have been a number of more or less theoretical analyses of requirements and
ambitious approaches to agent-based solutions for KM (see, e.g., [56, 72]), as well
as experimental systems exploring the use of agents for investigating one aspect or
another (such as weakly-structured workflow, ontology mediation, metadata for
knowledge retrieval, or contextuality) of comprehensive agent-based KM frameworks
(like FRODO, CoMMA, Edamok [3, 31, 11, 36]; some of them are included in this book).
We are well aware that we are currently far from reaching a state where we can
oversee all methodological, technological, and practical benefits and prospects,
problems and pitfalls, and challenges and achievements of Agent-Mediated Knowledge
Management. But we hope, and are fairly confident, that this paper as well as this
volume gives a good idea of the AMKM landscape, opens up some new ways for
interesting future work and shows how far we have already come.



2 A Description Schema for Agent-Based KM Approaches 

In research as well as in first generation “real-world applications” several agent-based 
systems exist to support various aspects of Knowledge Management, from personal 
information agents for knowledge retrieval to agent-based workflows for business pro- 
cess-oriented KM. In order to be able to compare different agent approaches to KM, 
we need to describe agent and multi-agent architectures in a way that abstracts from the 
particularities of individual implementations, but still captures their relevant character- 
istics. A couple of helpful classification schemas for single agents and multi-agents 
systems have already been proposed (e.g., Franklin and Graesser’s taxonomy of agents 
[35]), discriminating agents for example by their tasks (information filtering, interface 
agents etc.), their abstract architecture (e.g., purely reactive vs. agents with state) or 
concrete architecture (e.g., belief-desire-intention vs. layered) architectures (cf. [82]), 
or other specific features (mobility, adaptivity, cooperativeness, etc.). 

For instance, [61] presented an interesting top-level characterization of agent appli- 
cations, basically distinguishing three kinds of domains: 

1. Digital domains where the whole environment of the agents is constituted from 
digital entities, as is the case, e.g., in telecommunications or static optimization 
problems. 

2. Social environments where software agents interact with human beings.

3. Electromechanical environments where agents manipulate and experience the
non-human physical world via sensors and actuators, as is the case, e.g., in robotics,
factories, etc.

A further classification dimension can be added directly because besides the domain to 
be handled by the agents we also have to consider the kinds of interfaces to be provided 
by an agent-based application. Here we have the same options as above: we need so- 
cial interfaces to integrate people, digital interfaces to interact with other agents, and 
electromechanical interfaces to link to the physical world. In the case of KM applica- 
tions we normally have to consider a (highly) social environment with both social and 
(usually a number of different) digital interfaces. 

For the purpose of this paper, we propose a description schema that is on the one 
hand more specific than these classifications and on the other hand also captures the 
whole life cycle of agent-oriented system development. To get an overview of agent 
approaches for KM, we think that a categorization along three dimensions is especially 
beneficial: 

1. the stage in a system’s development process where agents are used (analysis, con- 
ceptual design, or implementation); 

2. the architecture / topology of the agent system; and 

3. the KM functionality / application focused on. 

We discuss these dimensions in the following three subsections. 
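
As a rough illustration, the three dimensions can be recorded as a small data structure. The sketch below is only one possible reading of the schema: the names `DevelopmentStage`, `Topology`, `SystemDescription` and `more_sociable`, the free-text treatment of the third dimension, and the two sample entries are assumptions made for illustration. The topology values anticipate the degree of sociability discussed in Section 2.2.

```python
from dataclasses import dataclass
from enum import Enum


class DevelopmentStage(Enum):
    """Dimension 1: stage of the development process where agents are used."""
    ANALYSIS = "analysis"
    CONCEPTUAL_DESIGN = "conceptual design"
    IMPLEMENTATION = "implementation"


class Topology(Enum):
    """Dimension 2: macro-level structure, ordered by degree of sociability."""
    SINGLE_AGENT = 1
    HOMOGENEOUS_MAS = 2
    HETEROGENEOUS_MAS = 3


@dataclass(frozen=True)
class SystemDescription:
    name: str
    stages: frozenset       # the DevelopmentStage members that apply
    topology: Topology
    km_functionality: str   # Dimension 3, kept as free text in this sketch


def more_sociable(a, b):
    """True if system a sits further along the sociability spectrum than b."""
    return a.topology.value > b.topology.value


# Hypothetical entries, for illustration only.
personal_agent = SystemDescription(
    name="personal information agent",
    stages=frozenset({DevelopmentStage.IMPLEMENTATION}),
    topology=Topology.SINGLE_AGENT,
    km_functionality="knowledge retrieval",
)
integration_mas = SystemDescription(
    name="information integration architecture",
    stages=frozenset({DevelopmentStage.ANALYSIS, DevelopmentStage.IMPLEMENTATION}),
    topology=Topology.HETEROGENEOUS_MAS,
    km_functionality="information integration",
)
```

Describing systems in this uniform way makes them comparable along each dimension separately, which is the purpose of the schema.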

2.1 System Development Level 

Agent-oriented Software Engineering emphasizes the adequacy of the agent metaphor 
for design and implementation of complex information systems with multiple distinct 
and independent components. Agents also enable the aggregation of different function- 
alities (such as planning, learning, coordination, etc.) in a conceptually embodied and 
situated whole [51]; agents also provide ways to relate directly to these abstractions in 
the design and development of large systems. 

In Knowledge Management, not only are the IT systems highly complex and dis- 
tributed, but also the organizational environment in which these systems are situated. 
Especially in more comprehensive KM approaches, the complexity of the organization 
has to be reflected in the IT architecture. Often, “real world entities” of the organization 
have a relatively direct counterpart in the computer system, leading to a rather tight 
coupling between the real and the virtual worlds. Therefore, an organizational analy- 
sis is commonly an integral part of methodologies for the development of Knowledge 
Management IT (see, e.g., the CommonKADS [74], or the DECOR [59] methods).
Originating in the realm of human collaboration, the notion of agents can be an epis- 
temologically adequate abstraction to capture and model relevant people, roles, tasks, 
and social interactions. These models can be valuable input for the requirements analy- 
sis phase for the development of the KM system. 

So, due to the fundamentally social nature of KM applications, the agent paradigm
can be - and actually has been - applied at different development levels, such as
analysis, modeling and design, and not just to represent technological components of
implemented systems. Figure 1 gives an overview of the use of agents on different levels
in the system engineering cycle. Of course, on each level we can have different specific
agent theories (that is, how agents are conceptualized, what basic properties they have,
etc. [83]) and respective representation languages (which on the implementation level
may be operational programming languages) for defining concrete agents and their
relations. Methodologies for agent-oriented software engineering like Tropos [40] and
Gaia [85] not only define these representation languages for different levels, they are
also the glue between them by providing mappings and processes for the transition from
one level to another. The hope is, of course, that on the basis of a high correspondence
of the primitives on each level these transitions will be smooth and less error-prone.
Even though such methodologies provide a powerful tool to design multi-agent systems,
and are currently widely used, they are not always suitable to deal with the complexity
of fully fledged KM environments, including openness and heterogeneity. In [27] overall
design requirements for KM environments were identified, which include the need to
separate the specification of the organizational structure from the internal
architecture of its component entities, and the need for explicit representation of
normative issues. A recent proposal for a methodology for agent societies that meets
these requirements is presented in [25].

Fig. 1. Notion of Agents at Different Stages in the Development Cycle of an
Agent-Based KM System

However, even when it seems likely that the entire development life cycle for KM
applications can benefit from the concept of agents, we are well aware that in concrete,
real-life situations pragmatic reasons⁵ may often lead to the use of agents at just one
or two development levels. On the other hand, having to implement a KM system on
the basis of “conventional software” (like relational databases or client/server-based
groupware solutions) or on the basis of modern, strongly related technologies like
peer-to-peer networks or web services should not necessarily hinder an agent-oriented
analysis and system design⁶.

2.2 Macro-level Structure of the Agent System 

Agent theories, abstract agent architectures, and agent languages as defined in [83] 
mainly take a micro-level view, i.e., they focus on the concept of one agent: What 
properties does an agent have, how can these properties be realized in a computer 
system, what are the appropriate programming languages for that? For Knowledge 
Management — which typically employs a strong organizational perspective — the mac- 
ro-level structure is also of special interest. How many agents do we have? What types 
of agents? What is the topology with respect to the flow of information, or with respect 
to the co-ordination of decisions? One possible dimension to characterize the macro- 
level of an agent-based KM system is the degree of sociability as depicted in Figure 2: 



[Figure: spectrum of sociability with the stages Single Agent, Homogeneous MAS, and
(Heterogeneous) Agent Societies; examples shown: Personal Information Agent,
Cooperative Retrieval Agents, Agent-based OM Architecture, Agent-based Distributed OM
Architecture]

Fig. 2. Degree of Sociability



- Single-agent architectures are at one end of the spectrum. Typical examples come 
from the area of user interface or personal information agents which build a model 
of a user’s interest and behavior, and exploit this knowledge to support him or her 
by providing relevant information, e.g., from the Web. These agents can perceive 
their environment and access some objects like web resources, but they normally 
have no elaborated interaction (like collaboration or negotiation) with other agents 
(except for the human user). 

5 In [84], Wooldridge and Jennings nicely describe classes of pitfalls for the development of 
agent-based systems, including the “overselling”, “being dogmatic”, and “agents as silver 
bullet” pitfalls. Parunak [60, 61] also discusses the pragmatics of agent-based software devel- 
opment in real-world settings. 

6 Actually we observe that technologies like P2P and web services incorporate many aspects 
of the notion of agents when encountering application domains that have characteristics like 
those described in Section 1 for KM applications. 

- Homogeneous multi-agent architectures already have a higher degree of sociability.
Agents can co-operate with other agents in order to solve their tasks. Homogeneity 
means that the system consists mainly of one type or class of agents. These agents 
do not necessarily have to have exactly the same goals, but their tasks and capabil- 
ities are comparable. Agent-based collaborative filtering is a typical example for 
this class of MAS: All agents are seen as peers which can provide information on 
what entities they use or like, and each agent can collect this information to provide 
the user with valuable hints about interesting new information. Nevertheless, all 
agents may have individual information collection and integration strategies. 

- Heterogeneous multi-agent architectures contain multiple agent classes which may 
have completely different purposes, knowledge and capabilities. Various informa- 
tion integration architectures (e.g., Knowledge Rovers [44], MOMIS/MIKS [8]) 
are described as heterogeneous MAS: Specialists exist for wrapping information 
sources, agents for integrating different description schemas, and for adequately 
presenting information to the users. All these different agent types have to co- 
operate and bring in their complementary expertise in order to accomplish the over- 
all goal of the system. 
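
The collaborative filtering example given for the homogeneous case can be sketched roughly as follows. This is a minimal illustration only: the class `FilteringAgent`, its methods, and the overlap-based scoring are assumptions made for this sketch and are not taken from any of the cited systems.

```python
class FilteringAgent:
    """Hypothetical peer agent that holds one user's liked items."""

    def __init__(self, owner, liked):
        self.owner = owner
        self.liked = set(liked)

    def report_likes(self):
        # Every peer offers the same service: tell others what its user likes.
        return set(self.liked)

    def recommend(self, peers):
        """Suggest unseen items, weighted by how similar the reporting peer is."""
        scores = {}
        for peer in peers:
            theirs = peer.report_likes()
            overlap = len(self.liked & theirs)  # crude similarity (assumption)
            if overlap == 0:
                continue
            for item in theirs - self.liked:
                scores[item] = scores.get(item, 0) + overlap
        # Highest score first; ties broken alphabetically for determinism.
        return sorted(scores, key=lambda item: (-scores[item], item))


alice = FilteringAgent("alice", {"paper_a", "paper_b"})
bob = FilteringAgent("bob", {"paper_a", "paper_b", "paper_c"})
carol = FilteringAgent("carol", {"paper_x"})
dave = FilteringAgent("dave", {"paper_b", "paper_d"})

hints = alice.recommend([bob, carol, dave])
print(hints)
```

All agents belong to one class and offer the same service, yet each could override `recommend` with its own collection and integration strategy, which is the sense of homogeneity used above.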

A characterization of the macro-level structure of an agent-based KM system may, 
in addition to the description of the number of agents and the system’s heterogeneity, 
also include facets like 

- co-ordination form: How are decisions and information flow coordinated? On the 
basis of a market model? As a fully connected network? Or in a hierarchical man- 
ner? 

- open vs. closed system: Can new agents enter the system? If yes, does their agent 
class (competencies, purpose, etc.) have to be known in advance? Or can new types 
of agents be integrated easily (even at runtime)? 

- implicit vs. explicit social structure: Do the agents have an explicit representation 
of their role in the system which allows for a certain assurance of the system’s 
global behavior? Do they even have a machinery for reasoning about their rights 
and obligations? Are roles globally defined or negotiated? Or is the agent’s social 
behavior only locally controlled and the system’s behavior completely emergent? 
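
As a rough illustration, the facets listed above can be recorded in a small data structure. The type and field names below (`MacroStructure`, `Sociability`, `Coordination`, and so on) are hypothetical and chosen only for this sketch; the source does not prescribe any such representation.

```python
from dataclasses import dataclass
from enum import Enum


class Sociability(Enum):
    SINGLE_AGENT = 1
    HOMOGENEOUS_MAS = 2
    HETEROGENEOUS_MAS = 3


class Coordination(Enum):
    MARKET = "market"
    FULLY_CONNECTED = "fully connected network"
    HIERARCHY = "hierarchy"


@dataclass(frozen=True)
class MacroStructure:
    """Hypothetical record of the macro-level facets named in the text."""
    sociability: Sociability
    coordination: Coordination
    open_system: bool                # can new agents enter the system?
    explicit_social_structure: bool  # do agents represent their roles explicitly?


def describe(m: MacroStructure) -> str:
    openness = "open" if m.open_system else "closed"
    roles = "explicit" if m.explicit_social_structure else "implicit"
    return (f"{m.sociability.name.lower()}, {m.coordination.value} coordination, "
            f"{openness}, {roles} social structure")


# Illustrative only: a market-coordinated society with implicit roles.
marketplace = MacroStructure(Sociability.HETEROGENEOUS_MAS, Coordination.MARKET,
                             open_system=True, explicit_social_structure=False)
print(describe(marketplace))
```

Such a record makes it easy to compare systems along the same facets, although real systems may sit between the enumerated values.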

Electronic institutions are a typical example of a complex society architecture. Elec- 
tronic institutions provide a computational analogue of human organizations in which 
agents interact through roles that are defined as specified patterns of behavior [79]. 
Similarly, virtual organizations can potentially take advantage of the new electronic 
environments through coalition formation among disparate partners to form aggregate 
entities capable of offering new, different or better services than might otherwise be 
available. To design such systems requires a theory of organization design, and know- 
ledge of how organizations may change and evolve over time. Sociological organization 
theory and social psychology are clearly important inputs to the design. Moreover, for 
the design of open societies, political theory may be necessary. Open systems permit 
the involvement of agents from diverse design teams, with diverse objectives, which 
may all be unknown at the time of design of the system itself. How the system as a 
whole makes decisions or agrees on joint goals will require the adoption of specific 
political philosophies, for example whether issues are subject to simple majority voting
or transferable preference voting, etc. (cp. [51]). 

Of course, the above examples for different degrees of sociability (single-agent,
homogeneous/heterogeneous MAS) do not form a discrete, categorical distinction.
On the contrary, they are exemplary operating points on a continuous scale. Heterogeneous
MAS, e.g., may have sub-societies that are homogeneous themselves. Or a
system may be mainly homogeneous, but have one specialist agent for a certain task.
And even between several aspects of communication the structure of the system may 
differ. So, the topology for making decisions may be a hierarchy, while information may 
be spread based on a market or fully connected network model [29]. It is also clear that 
there are dependencies between the three facets of system description. Not all possible 
combinations fit equally well together, not all of them are equally useful. For instance, 
if we have a highly structured agent society (like the electronic institutions outlined 
above) we can normally profit from known social structures when designing effective 
co-ordination, communication, and decision-making mechanisms, and do not have to 
use such a general, but “expensive” mechanism as a fully connected network. On the 
other hand, the more social structure is explicitly implemented into an agent society, the 
more “closed” this society might be in the sense that entering it will probably be based 
on a well-specified procedure, depending both on the current status of the society and 
on the capabilities and goals of a new agent that wants to enter it. Conversely,
if we have a relatively “democratic” way of co-ordination, like a market model, and a
completely implicit social structure, it might be fairly easy for such an agent society to
act as a quite “open” system.



2.3 KM Application Area 

The two classification dimensions for multi-agent systems described in the previous
subsections are not directly related to applications in the KM domain. Up to now, we
looked at the level in system development where the notion of agents is used, and at
the macro-level structure of the agent system 7 . The third dimension for characterizing
agent-based KM applications, described in this subsection, deals with the specific
knowledge management functionality of the system: What is the scope of the system?
Which Knowledge Management processes or tasks are supported?

In this paper, we do not want to prescribe a detailed framework for this dimension, but 
only want to gather and offer some possibilities and general directions. 

Principally, all high-level Knowledge Management models can be seen as a starting
point to form the vocabulary for this dimension, and there are many such KM models.
We will start with the famous KM cycle by Probst et al. [64] which, in addition to
the management-oriented tasks of defining knowledge goals and assessing the organization's
knowledge, identifies six building blocks:

- Identification processes analyze what knowledge exists in an organization, what the 
knowledge containers are, who the stakeholders are, etc. 

7 Though it should be noted that the emphasis on the degree of sociability as an important di- 
mension of characterization is strongly biased by our theoretical analysis of KM in Section 1. 

- Acquisition is the process of integrating external knowledge into an organization.

- Development processes generate new knowledge in the organization. 

- Distribution processes connect knowledge containers with potential users. 

- Preservation aims at the sustainability of knowledge, i.e., ensuring that it remains accessible and
understandable over a period of time.

- Utilization means to operationalize available knowledge in order to solve actual 
business tasks (better). 

Originating in the management sciences, Probst et al.’s view has been widely adopted 
and adapted in technology-oriented KM literature (e.g., [1, 76]). Likewise, the classical 
model of Nonaka and Takeuchi [58] — which focuses on knowledge generation — can 
be used to describe the KM application area of a system. These authors claim that 
new knowledge is created by four types of transformation processes between implicit 
/ internal knowledge (e.g., competencies, experiences, skills) and explicit / external 
knowledge (e.g., facts, coded rules, formal business processes): 

- With socialization, knowledge that is implicit to a person is transferred to another 
person by sharing experiences. Apprenticeship learning, for example, makes heavy 
use of socialization. 

- Externalization is the process of making implicit knowledge explicit, e.g., by talk- 
ing about it, writing it down informally or by formalizing it. Knowledge acquisition 
techniques developed in expert system research mainly aim at externalization. 

- Combination is the basis for generating new knowledge from external knowledge
by relating knowledge pieces with other knowledge pieces. Data mining and machine
learning are technical approaches to this type of knowledge creation process.

- Internalization is the transformation of explicit knowledge into implicit knowledge,
thereby making it applicable.
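
The four transformation types can be read as a lookup from the kind of source knowledge and the kind of target knowledge to the name of the process. The following sketch encodes exactly that reading; the names `Knowledge`, `CONVERSIONS`, and `conversion` are introduced here for illustration only.

```python
from enum import Enum


class Knowledge(Enum):
    IMPLICIT = "implicit"   # competencies, experiences, skills
    EXPLICIT = "explicit"   # facts, coded rules, formal business processes


# (source kind, target kind) -> transformation process, as listed in the text.
CONVERSIONS = {
    (Knowledge.IMPLICIT, Knowledge.IMPLICIT): "socialization",
    (Knowledge.IMPLICIT, Knowledge.EXPLICIT): "externalization",
    (Knowledge.EXPLICIT, Knowledge.EXPLICIT): "combination",
    (Knowledge.EXPLICIT, Knowledge.IMPLICIT): "internalization",
}


def conversion(source: Knowledge, target: Knowledge) -> str:
    """Name the transformation that leads from one knowledge kind to another."""
    return CONVERSIONS[(source, target)]


print(conversion(Knowledge.IMPLICIT, Knowledge.EXPLICIT))
```

A system description along this dimension could then simply list which of the four entries the system supports.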

From these classical models, several further distinctions have been developed in Know- 
ledge Management research that can be utilized to describe the application area. For 
example, systems can take a more process-oriented or a more product-oriented view 
[47, 54]. The latter emphasizes the management of explicit knowledge contained in
“knowledge products” such as databases, documents, formal knowledge bases, etc.; the
former focuses on human beings and their internal knowledge, i.e., the “process of
knowing” and the “process of knowledge exchange” between people. Typical systems
with a product-oriented view are document retrieval agents. Expert finder systems, on 
the other hand, take a more process-oriented view. Furthermore, a KM system can sup- 
port individuals and their tasks at hand, it can support teams and groups, or it may 
take a more global, organizational perspective. The theoretical analysis of Knowledge 
Management characteristics in Section 1 may be the source of further possible applica- 
tion areas for information technology, e.g., facilitating trust, motivating users to share 
knowledge, or establishing group awareness. 

Concrete agent-based KM applications may deal with one or a few of these aspects, 
or they may be more comprehensive frameworks that try to cover large parts of the KM 
cycle. In the following section we will analyze existing agent-based KM applications, 
illustrative for the different approaches. 

3 Exemplary Agent-Based KM Applications

In the previous section, we proposed three dimensions to describe agent-based Know- 
ledge Management systems: i) the system development level (analysis, design, imple- 
mentation), ii) the macro-level structure of the system (single agent, heterogeneous, or 
homogeneous MAS), and iii) the KM application area (knowledge distribution, gener- 
ation, use, etc.). In this section, we will present some examples of agent-based systems 
developed to support and/or model Knowledge Management domains. We group these 
systems by the second dimension (macro-level structure), because this also largely re- 
flects and matches the historical evolution of research in this area. Since compiling a 
complete overview of the systems in all three dimensions is well beyond the scope of 
this paper, we briefly sketch some systems which we consider typical for the specific 
approach. Our aim is to present current developments in Agent-Mediated Knowledge 
Management, indicate their differences to conventional approaches, expose their bene- 
fits, and suggest areas for further work. 

3.1 Predominantly Single Agent Approaches 

Most KM support systems that take a single agent approach are User Interface Agents or 
Information Agents. A User Interface Agent embodies the metaphor of “a personal as- 
sistant who is collaborating with the user in the same work environment” [53]. Though 
this rather general definition would comprise agent support for all kinds of KM activ- 
ities that a knowledge worker can perform (e.g., distribute knowledge, generate new 
knowledge), virtually all systems in this class are information agents 8 . These agents 
typically 

- have access to a variety of information sources, 

- handle a model of the user’s information needs and preferences, and 

- try to provide relevant information to the user in an adequate way, either by filtering 
incoming information from the sources or by actively retrieving it. 

Prototypical systems in this category use e-mail in-boxes, news forums, dedicated KM 
databases within the company, intranet documents, or internet search engines as infor- 
mation sources. 

A representative architecture for an intelligent information agent that assists the user 
in accessing a (not agent-based) Organizational Memory, in this case the OntoBroker 
system, is described in [77, 73]. The agent relies on an explicit model of the business 
process the user is engaged with and uses this knowledge of the work context to de- 
termine when information support may be appropriate and what information may be 
useful in that context. 

Two variants of the system are available, a reactive and a proactive one. In the re- 
active case, the user triggers the agent by selecting a specific (pre-modelled) query 
in a specific application context. The agent then tries to retrieve relevant knowledge 
from the Organizational Memory and passes it on directly to the respective application 



8 For an overview of personal information agents, also for other tasks like expert finding and
information visualization, see [49].

that triggered the information need and thereby to the user. The reactive agent must
have complete knowledge about the process context and about the information needs. 
The proactive agent, on the other hand, relaxes these two requirements, i.e., the appli- 
cation context and the relevant queries may be only partially defined when the agent 
becomes active. Instead, the agent has a proactive inferencing mechanism which employs
heuristics to retrieve relevant information based on incomplete context and query
specifications. In order to cope with the potentially huge number of possible results and
related problems (e.g., storage, processing time) from inferencing with underspecified
context, the proactive agent is equipped with a mechanism for bounded resource consumption.
For their actual knowledge retrieval step, OntoBroker agents, both reactive
and proactive, exploit the ontology-based structure of the Organizational Memory.
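
The difference between the two variants can be illustrated with a toy sketch. It is not the OntoBroker mechanism: the tag-based memory, the overlap heuristic, and the result cap `max_results` are assumptions standing in for the ontology-based retrieval, the proactive heuristics, and the bounded resource consumption described above.

```python
from itertools import islice

# Toy Organizational Memory: each item is tagged with the contexts it serves.
MEMORY = [
    {"title": "Travel expense guideline", "tags": {"travel", "approval"}},
    {"title": "Supplier contract template", "tags": {"purchasing", "contract"}},
    {"title": "Approval checklist", "tags": {"approval", "purchasing"}},
    {"title": "Visa requirements", "tags": {"travel"}},
]


def reactive_retrieve(query_tags):
    """Fully specified query: return only items matching every requested tag."""
    return [item["title"] for item in MEMORY if query_tags <= item["tags"]]


def proactive_retrieve(partial_context, max_results=2):
    """Underspecified context: rank by overlap and stop at a fixed budget."""
    scored = ((len(partial_context & item["tags"]), item["title"])
              for item in MEMORY)
    # sorted() is stable, so equally scored items keep their memory order.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
    return [title for _, title in islice(ranked, max_results)]


exact = reactive_retrieve({"travel", "approval"})
guessed = proactive_retrieve({"approval", "travel", "contract"})
print(exact, guessed)
```

The reactive call only succeeds when the query is complete, while the proactive call tolerates a partial context at the price of returning ranked guesses, which is why some cap on its effort is needed.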

However, many personal information agents are designed for an environment where 
such an ontological structure of the information sources cannot be assumed, e.g., the 
World Wide Web. In this case, agents often rely on standard information retrieval tech- 
niques for searching. Rhodes and Maes [70] present three just-in-time information re- 
trieval (JITIR) agents: The Remembrance Agent continually presents a list of documents 
that are related to a document that is currently being written or read in the Emacs editor, 
Margin Notes uses documents loaded in a Web Browser as context, and Jimminy uses 
the physical environment (location, people in the room, etc.) to determine what information
may be relevant. All three agents use the same back-end system Savant [69] for
the actual information retrieval step. 
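
To make the idea of retrieval from the current context concrete, the following sketch ranks stored documents by bag-of-words cosine similarity to the text the user is working on. This is a generic standard information retrieval technique used here for illustration; it is not a description of Savant, and the function names and sample documents are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Very simple bag-of-words term counts (no stemming or stop words)."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def suggest(context, documents, top_k=2):
    """Return the names of the documents most similar to the current context."""
    ctx = vectorize(context)
    ranked = sorted(documents,
                    key=lambda name: cosine(ctx, vectorize(documents[name])),
                    reverse=True)
    return ranked[:top_k]


documents = {
    "notes_agents": "software agents for knowledge management",
    "recipe": "bake the bread for forty minutes",
    "memo_km": "knowledge management in the organization",
}
hits = suggest("agents support knowledge management", documents)
print(hits)
```

A just-in-time agent would call something like `suggest` whenever the context changes, for example after each edited paragraph, and display the result unobtrusively.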

Nevertheless, the primary contributions of research in personal information agents 
are not so much the various core retrieval techniques (from statistics-based similari- 
ties of text documents up to ontology-based access to formalized knowledge items), 
but the development of adequate sensors and effectors for personal information agents. 
Sensors define the way the agents can assess the context of their services, i.e., when 
to perform a service proactively and what the user’s actual information need is. Here, 
a wide range of approaches are covered in literature, from the pre-modelled business 
processes described above, to observing knowledge workers in their usage of standard 
office applications like text processors, web browsers or mailing tools (cf. Watson [17] 
or Letizia [48]). 

The effectors of user interface agents, on the other hand, determine the way informa- 
tion can be presented to the user. The JITIR agent Margin Notes [70], for example, 
automatically rewrites Web pages as they are loaded, and places links to personal infor- 
mation items in a dedicated area of the page. Watson presents suggestions in a dedicated 
window, and in KnowMore [2], information from the Organizational Memory can be 
directly handed over to specific fields in a form-based application. 

We now discuss the characteristics of personal assistants along the other two characterization
dimensions for AMKM applications described in Section 2. Concerning the
level of system development, personal assistant approaches are mostly deployed at the 
modelling level. The most relevant aspect used from the agent metaphor is that an agent 
acts on behalf of a user who has specific goals and interests. Regarding the implemen- 
tation level, personal assistants are currently mostly implemented using conventional 
programming techniques, i.e., without using a more general “agent development kit for 
personal information agents”. A well-known exception is Letizia, developed at MIT 
[48]. With respect to the KM application area, personal assistants, as user-directed approaches,
are mainly related to the dissemination of knowledge to be used by knowledge
workers, in a just-in-time, just-enough fashion. Applications such as OntoBroker take 
a product-oriented view on knowledge, as they emphasize the management of explicit 
knowledge sources. 

To sum up, we can say that many of the presented ideas are already well-developed 
in the technological sense, and some of them have even found their way into commercial
software products of advanced vendors. In those applications, the term “agent” is
often not used in the narrower technical sense, but merely as a communication or
design metaphor; the systems are not built upon dedicated agent software platforms. There is a
clear, but indirect, link between the functionalities achieved by such systems and our 
KM software requirements R1 - R4 defined above. Usually one can see that the soft- 
ware functionalities provided here are useful, because they address issues caused by 
our items R1 - R4 (e.g., in frequently changing environments, push services achieved 
by personal information agents are much more important than in stable environments, 
since an agent can continuously monitor whether some relevant change has happened). 
Altogether, though the software functionalities are stable to some extent and apparently
useful, the logical next step for research and application has seldom been taken,
namely a rigorous assessment of usability and usefulness. There are a few specific
experiments evaluating Personal Information Agents and the influence of
process-aware, proactive information delivery (see [16, 18, 32, 70]), but
in our opinion there is still a need for broad and long-term experiments on usability,
user acceptance, and the influence of KM tools on working behavior and working
efficiency / effectiveness.



3.2 Homogeneous Multi-agent Approaches 

As described in Section 2.2, homogeneous multi-agent systems are formed by several 
agents belonging mainly to a common “agent class”, i.e., on an abstract level they have 
comparable competencies and goals (albeit they might act on behalf of different users) 9 . 
Pure homogeneous multi-agent systems are rarely found in literature. Typically, facili- 
tation functions (e.g., matchmaking and management of collaboration) are encapsulated 
as (centralized) service agents, different from the other agents, which might be homo- 
geneous. Examples of such “weakly homogeneous” systems, mostly specialized on one 
KM task, are presented later in this section. 

An obvious extension to the personal information agents described in the previous 
section is to see each user not only as an information consumer, but also as a provider. 
In this case, besides retrieval and presentation support, the personal agent should assist 
the user in serving as a source of information. Very simple examples of such agents
are the clients for peer-to-peer file sharing like Kazaa, ED2K, or, in the domain of
learning resources, Edutella [57]. These agents have specialized interfaces for
expressing queries, passing them on to other agents and displaying the results. But they 

9 This definition identifies homogeneous multi-agent systems as close conceptual relatives of 
peer-to-peer (P2P) systems, even though their implementational basis can be quite different 
(cf. Section 2.1). 

are also able to receive queries and process them by answering with result documents
or by passing a query to other agents. Such interaction between different personal as- 
sistants can be considered as a multi-agent system. In the following, more elaborate 
approaches are also described. 

MARS, an adaptive social network for information access described and evaluated
in [86], has a purely homogeneous structure that is based on the idea described in the
previous paragraph. Each agent basically has two competencies: i) to deliver some do- 
main information with respect to a query, and ii) to refer to other agents that may fulfill a 
specific information need. Additionally, the agents learn assessments of the other agents 
in the network with respect to the two aspects. This means they assess the other agents’ 
expertise (ability to produce correct domain answers) as well as their ability to produce 
accurate referrals. 
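
The two competencies and the two learned assessments can be sketched as follows. This is a simplified illustration under stated assumptions, not the MARS algorithm from [86]: the class `ReferralAgent`, the neutral starting rating of 0.5, and the update rule in `learn` are all hypothetical.

```python
class ReferralAgent:
    """Hypothetical peer that either answers a query or refers it onwards."""

    def __init__(self, name, answers):
        self.name = name
        self.answers = dict(answers)  # topic -> answer this agent can give itself
        self.neighbors = []           # agents this one may refer a query to
        self.expertise = {}           # learned rating of each neighbor's own answers
        self.referral_skill = {}      # learned rating of each neighbor's referrals

    def _rating(self, other):
        # Unknown neighbors start at a neutral 0.5 on both aspects (assumption).
        return (self.expertise.get(other.name, 0.5)
                + self.referral_skill.get(other.name, 0.5))

    def ask(self, topic, visited=None):
        """Return (answer, chain of agent names); answer is None if nobody knows."""
        visited = set() if visited is None else visited
        visited.add(self.name)
        if topic in self.answers:
            return self.answers[topic], [self.name]
        for neighbor in sorted(self.neighbors, key=self._rating, reverse=True):
            if neighbor.name in visited:
                continue  # avoid referral cycles
            answer, chain = neighbor.ask(topic, visited)
            if answer is not None:
                return answer, [self.name] + chain
        return None, [self.name]

    def learn(self, chain, useful, rate=0.5):
        """Move the rating of the contacted neighbor towards the outcome."""
        if len(chain) < 2:
            return  # answered locally, nothing to learn about others
        neighbor = chain[1]
        # Direct answer -> expertise; otherwise the neighbor acted as a referrer.
        table = self.expertise if len(chain) == 2 else self.referral_skill
        old = table.get(neighbor, 0.5)
        table[neighbor] = old + rate * ((1.0 if useful else 0.0) - old)


a = ReferralAgent("a", {})
b = ReferralAgent("b", {})
c = ReferralAgent("c", {"ontologies": "ask the ontology team"})
a.neighbors, b.neighbors = [b], [c]

answer, chain = a.ask("ontologies")
a.learn(chain, useful=True)
print(answer, chain, a.referral_skill)
```

Keeping the two ratings separate lets an agent value a neighbor that rarely knows the answer itself but reliably points to someone who does.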

DIAMS [20] is a system of distributed, collaborative information agents that help 
users access, collect, organize and exchange information on the World Wide Web. DI- 
AMS aims at encouraging collaboration among users. Personal agents provide their 
owners with dynamic views on well-organized information collections, as well as with 
user-friendly information management utilities. These agents work closely together 
with each other and with other types of information agents such as matchmakers and 
knowledge experts to facilitate collaboration and communication. In order to promote 
easy information sharing and exchange, an object-based structure is used for the infor- 
mation repositories. DIAMS furthermore uses a flexible hierarchical presentation of in- 
formation integrated with indexed query functionalities to ensure effective information 
access. Automatic indexing methods are employed to support translation between user 
queries and communication between agents. Collaboration between users is aided by 
the easy sharing of information and is facilitated by automated information exchange. 
Connections between users with similar interests can be established with the help of 
matchmaker agents. 

The focus of the research described in [62] is to add context-awareness to per- 
sonal information agents that are (homogeneous) peers in a larger society of agents. 
The so-called CAPIAs (Context-Aware Personal Information Agents) have a model of 
their social and potential process context (e.g., the user’s schedule) as well as of their 
physical context (time and location). In the COMRIS Conference Center system the 
CAPIAs are employed for context-sensitive presentation of relevant information, e.g., 
whether “interesting” conference attendees or events (sessions, exhibition booths) are 
to be found nearby. 

Homogeneous multi-agent approaches in Knowledge Management seem to be a 
good way for leveraging single-agent approaches by taking advantage of the know- 
ledge of other users in the organization. In the GroupLens project these leveraging ef- 
fects are systematically investigated [41]. However, such systems are often not designed 
as agent systems. Due to their focus on one KM task (e.g., recommendation of one spe- 
cific type of information objects) and a relatively controlled environment, centralized 
implementations are common. For example Let’s Browse [50], the successor of the per- 
sonal information agent Letizia [48], does not model its collaborative web browsing as 
a cooperation between independent agents, but as one central agent that comprises the 
profiles of several users. An interesting but open question is to what extent multi-agent 
modelling has an “added value” (e.g., with respect to user trust, privacy concerns, willingness to
disclose information, ...) compared to a “functionally” (e.g., with respect to the quality
of the recommendations) equivalent monolithic system.

As with the single agent approaches presented above, applications of homogeneous
multi-agent systems to KM are mainly seen at the modelling level of development. On
the other hand, in relation to the KM dimension, multi-agent approaches are mostly
directed to the modelling of collaboration and interaction between users and systems,
that is, to socialization issues. While most systems still lean considerably towards a
product-oriented view of knowledge, these systems take a more process-oriented view 
on the management of knowledge than single agent approaches do, and can support 
teams and groups, as well as individual users. Homogeneous multi-agent approaches
mostly amount to a multiplication of a single agent, and as such may not provide
the depth needed at the analysis and design level for comprehensive KM.
Complex KM domains often require the combination of global and individual perspec- 
tives, and activities to follow desired structures, while enabling autonomous decisions 
on how to accomplish results. In order to cope with these requirements, heterogeneous 
approaches may be more appropriate, such as those described in the next subsection. 



3.3 Heterogeneous Multi-agent and Society-Oriented Approaches 

Heterogeneous multi-agent systems not only consist of a potentially high number of 
agents, but these agents also belong to different classes. This means the agents have 
diverse competencies and types of goals. The heterogeneity can be due to the large 
number of “real-world” entities of the organization that are reflected in the system, or 
due to a purely functional decomposition from a software engineering point of view. 
Also, the more Knowledge Management functions a system covers, the more heterogeneous
the system will be. The systems we present in this section comprise both types
of heterogeneity. Some of them only have a limited scope in terms of KM functionality 
(e.g., storing and retrieving knowledge objects), but encapsulate various service func- 
tions in separate specialized agents. Others are meant to be more comprehensive KM 
backbones and therefore employ agents for more diverse aspects like process support, 
retrieval support, and personalization. The society-oriented approaches we sketch at the 
end of this section demonstrate a potential way to cope with this heterogeneity and the 
complexity of such systems. 

The design of many agent-based Knowledge Management systems emerges from 
the “standard” three-tier enterprise information architectures that are often the basis for 
business applications (e.g., [55, 34, 45] and others): 

- The data layer manages repositories with knowledge objects such as documents, 
e-mail, etc. 

- The application layer realizes the business logic of the system. 

- The presentation layer organizes the interaction of the system with its users. 

KAoS [14, 19], a generic agent architecture for aerospace applications, is quite an
early agent-based system for the management of technical information contained in
documents and is based on such a layer model. Aiming mainly at flexible information
delivery from heterogeneous information sources in a distributed environment, KAoS
employs agents on all three layers. In addition, a layer with generic service agents pro- 
vides the middleware functionality of an agent platform (whitepage and matchmaking 
services for agents, proxies for connections to other agent domains, agent context man- 
agement). The data services wrap the information sources by encapsulating indexing, 
search and retrieval functions, but also monitor them to allow for proactive information 
push. The prototype system Gaudi uses the KAoS platform for situation-specific, adap- 
tive information delivery in the context of training and customer support in the airplane 
industry [13]. Recent versions of KAoS also incorporate social aspects in agent communities
[34]. However, the relevance of this approach for Knowledge Management
applications has not yet been discussed. 

Fig. 3. Three-layer KM Architecture [45] (reprinted with kind permission)



The focus of KM systems based on a layer architecture like the one presented above 
is mostly the reuse of information contained in the information sources. Consequently, 
the knowledge flow is mainly from the data layer to the presentation layer. The concep- 
tual model for Knowledge Management that Kerschberg presents with his Knowledge 
Rover architecture [44] does not have this principal restriction. He broadens the presentation
layer to a Knowledge Presentation and Creation Layer, which also comprises
discussion groups and other types of potential knowledge creating services [45] (cf. Fig- 
ure 3). Hence, knowledge flow from the presentation to the data layer is also taken into 
account. Consequently, the application layer comprehensively embraces all basic KM 
processes: acquisition, refinement, storage/retrieval, distribution, and presentation of
knowledge (cf. Section 2.3). 

For a knowledge reuse-oriented view, the integration of information from various 
sources (cf. [80]) is essential. One project that deals explicitly with the fusion of know- 
ledge from multiple, distributed and heterogeneous sources is KRAFT [63]. KRAFT 
has an agent-based architecture, in which all knowledge processing components are 
realized as software agents. The architecture uses constraints as a common knowledge 
interchange format, expressed in terms of a common ontology. Knowledge held in local 
sources can be translated into the common constraint language, fused with knowledge 
from other sources, and is then used to solve a specific problem, or to deliver some 
information to a user. The generic framework of the architecture can be reused across a 
wide range of knowledge domains and has been used in a network data services appli- 
cation as well as in prototype systems for advising students on university transfers, and 
for advising health care practitioners on drug therapies. The implementation of KRAFT 
is based on the FIPA standard with RDF as a content language.

Sharing knowledge between people can take place directly, e.g., in face-to-face col- 
laborations or with synchronous media like video conferencing, or indirectly, e.g., via 
information objects that are exchanged. Even hybrid approaches are possible, for ex- 
ample by analyzing the use of information objects and establishing direct links between 
people using the same objects. This direction was investigated in the Campiello project 
[46]. Campiello aims at using innovative information and communication technology 
to develop new links between local communities and visitors of historical cities of art 
and culture. The objectives of the project are to connect local inhabitants of historical 
places better, to make them active participants in the construction of cultural informa- 
tion and to support new and improved connections with cultural managers and tourists. 
The system includes a recommender module, a search module, and a shared data space. 
In order to facilitate the integration, tailoring and extensibility of these components, 
an agent model was chosen for the services in Campiello. The architecture supports 
interaction between distributed, heterogeneous agents and is built on top of the Voy- 
ager platform 10 which was extended towards an agent platform by adding directory and 
broker services, administration tools and agent classes. 

In an organizational environment, one of the main context aspects is the business 
process a knowledge worker is involved in. Business process-oriented Knowledge 
Management (BPOKM, cf. [5]) considers these processes i) as knowledge objects 
themselves, ii) as knowledge creation context, iii) as triggers indicating when some 
knowledge objects may be relevant, and iv) as context determining what knowledge 
may be relevant. The EULE system 
[67] shows an integration of business process modeling and knowledge management. 
The system takes a micro-level view on business processes by modeling and 
supporting “office tasks” of a single worker by just-in-time information delivery, but does not 
coordinate complete workflows performed by groups of people. While EULE is not 
an explicitly agent-based system, in the FRODO framework for Distributed Organiza- 
tional Memories [3] workflows themselves are first-order citizens in an agent-society 
for KM in distributed environments. An Organizational Memory in FRODO can be seen 
as a meta-information system with tight integration into enterprise business processes, 

10 http://www.recursionsw.com/products/voyager/voyager.asp 
which relies on appropriate formal models and ontologies as a basis for common 
understanding and automatic processing capabilities [1]. Figure 4 shows FRODO’s 
four-layer architecture for each Organizational Memory (OM): i) The application layer 
manages the process context in the form of weakly-structured workflows [32]. ii) The 
source layer contains information sources with various levels of formalization (process 
models, text documents, etc.). iii) The knowledge description layer provides uniform 
access to the sources by means of ontologies. iv) By utilizing these descriptions, the 
knowledge access layer connects the application with the source layer. Agents in a 
FRODO OM reside on all four layers: 

- Workflow-related agents (task agents, workflow model manager, ...) are on the ap- 
plication layer and control the execution of business processes. 

- Personal User Agents are also on the application layer and provide the interface to 
the individual knowledge worker. 

- On the knowledge access layer, Info Agents and Context Providers realize retrieval 
and other information processing services to support the task and user agents. 

- The knowledge descriptions are handled by Domain Ontology Agents. Dedicated 
Distributed Domain Ontology Agents serve as bridges between several OMs. 

- Wrapper Agents and Document Analysis and Understanding Agents enable access 
to the sources and informal-formal transitions of information, and are thus located 
in the knowledge object layer or at the intersection between knowledge objects and 
knowledge descriptions, respectively. 
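
The assignment of agent types to layers listed above can be written down as a small lookup structure. The following Python sketch is merely a restatement of the list, not part of the FRODO implementation; the identifier names are our own, and the source layer is equated with the knowledge object layer.

```python
from enum import Enum


class Layer(Enum):
    APPLICATION = "application"
    KNOWLEDGE_ACCESS = "knowledge access"
    KNOWLEDGE_DESCRIPTION = "knowledge description"
    SOURCE = "source (knowledge objects)"


# Agent types and the layer(s) the text assigns them to.
AGENT_LAYERS = {
    "Workflow-related agent": {Layer.APPLICATION},
    "Personal User Agent": {Layer.APPLICATION},
    "Info Agent": {Layer.KNOWLEDGE_ACCESS},
    "Context Provider": {Layer.KNOWLEDGE_ACCESS},
    "Domain Ontology Agent": {Layer.KNOWLEDGE_DESCRIPTION},
    "Wrapper Agent": {Layer.SOURCE},
    # Sits at the intersection of knowledge objects and knowledge descriptions.
    "Document Analysis and Understanding Agent": {
        Layer.SOURCE,
        Layer.KNOWLEDGE_DESCRIPTION,
    },
}


def agents_on(layer: Layer) -> list:
    """Return the agent types located on the given layer, sorted by name."""
    return sorted(name for name, layers in AGENT_LAYERS.items() if layer in layers)


for layer in Layer:
    print(layer.value, "->", agents_on(layer))
```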

In order to cope with the heterogeneity and complexity, as well as to constrain the 
overall behavior of the system, agents in FRODO are organized in societies. Therefore, a 
FRODO agent is not only described by its knowledge, goals and competencies, but also 
by its rights and obligations. The description of ontology societies in [30] exemplifies 
FRODO’s concept of socially-enabled agents for KM. The implementation is based on 
the FIPA-compliant agent platform JADE 11. 

FRODO’s approach towards Distributed Organizational Memories is strongly driven 
by the general considerations of KM presented in Section 1. The overall goal is to find 
a balance between the organizational KM needs and the individual needs of know- 
ledge workers. This is reflected in the way domain ontologies are handled in the dis- 
tributed environment. Coming from a comparable analysis of KM characteristics [11], 
the Edamok project 12 also aims at enabling autonomous and distributed management 
of knowledge. Edamok completely abandons centralized approaches, resulting in the 
peer-to-peer architecture KEx [10]. Each peer in KEx has the competence to create 
and organize the knowledge that is local to an individual or a group. Social structures 
between these peers are established that allow for knowledge exchange between them. 
In addition to the semantic coordination techniques that are required for this approach, 
the Edamok project also investigates contextual reasoning, natural language processing 
techniques and methodological aspects of distributed KM. 

An approach which is closely related to FRODO and Edamok has been developed 
in the CoMMA project [9]. The CoMMA architecture also employs societies of agents 
for personalized information delivery [38]: 

11 http://sharon.cselt.it/projects/jade/ 

12 http://edamok.itc.it/ 
Fig. 4. FRODO Architecture for a Single Organizational Memory 



- Agents in the ontology dedicated sub-society are concerned with the management 
of the ontological aspects of the information retrieval activity. 

- The annotation dedicated sub-society is in charge of storing and searching doc- 
ument annotations in a local repository and also of distributed query solving and 
annotation allocation. 

- The connection dedicated sub-society provides white page and yellow page ser- 
vices to the agents. 

- The user dedicated sub-society manages user profiles as well as the interface to the 
knowledge worker. 

The sub-societies in CoMMA can be organized hierarchically or peer-to-peer [39]. The 
position of an agent in a society is defined by its role [37]. The system was implemented 
on top of the JADE agent platform, and special attention was paid to the use of XML 
and RDF for representing document annotations and queries. 

As already stated above, business processes play an important role for providing 
context of knowledge generation and reuse. The utilization of the process context ranges 
from rather static access structures to knowledge objects (e.g., as a browsing hierarchy 
in a portal, or as an annotation that can be exploited by a search agent) to workflow-like 
agent-supported execution and the triggering of proactive information delivery. 

An interesting system that uses a concrete, domain-specific process model for its 
information support is K-InCA [71, 6]. In K-InCA, agents are used to guide, monitor 
and stimulate managers towards the understanding of KM concepts and the adoption 
of KM practices in organizational contexts, so that the system behaves as a personal 
KM coach for its users. The underlying process model of a K-InCA agent describes 
how changes are adopted by individuals and thereby new knowledge is incorporated 
into a person’s spectrum of working habits. K-InCA agents can be seen as experts on 
organizational behavior and change management, assisting users in the transition from 
their current working habits to new habits that integrate some new behavior (e.g. KM 
practices, entrepreneurial attitude, etc.). The system allows for different modes of inter- 
action (practice and coaching), aiming at bringing the user to adopt a desired behavior. 
In order to achieve this goal, agents react to the current user activity on the basis of 
information stored in a domain model and a user model, as well as through interaction 
with other agents. 

With respect to the question of where in the development cycle the notion of agents 
is used (cf. Section 2.1), most of the systems presented up to now take a kind of middle- 
out approach: All of them have an agent-based description of the system’s components. 
This description is partly motivated by a functional decomposition from an IT point of 
view and partly a result of reflecting real-world entities (users, groups, etc.) in the sys- 
tem. Some of these architectures are then implemented using “conventional” software 
technology (e.g., most user interface agents), others build upon dedicated platforms for 
agent systems (e.g., based on the FIPA 13 specifications). Only a few of the described 
systems complement their architectures with an agent-based Knowledge Management 
methodology for guiding the development of such a system in an organizational context 
(e.g., Edamok, stemming from the general MAS methodology Tropos and developing 
it towards KM). 
13 http://www.fipa.org/ 

A recent proposal for a design methodology specifically tailored to agent societies is 
OperA [25]. This methodology is based on a three-tiered framework for agent societies 
that distinguishes between the specification of the intended organizational structure and 
the individual desires and behavior of the participating agents: 

1. The organizational structure of the society, as intended by the organizational 
stakeholders, is described in the Organizational Model (OM). 

2. The agent population of an OM is specified in the Social Model (SM) in terms 
of social contracts that make explicit the commitments which are regulating the 
enactment of roles by individual agents. 

3. Finally, given an agent population for a society, the Interaction Model (IM) de- 
scribes possible interaction between agents. 
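
To make the dependency between the three models concrete, a minimal Python sketch follows. It is not OperA's formal specification language; the class names mirror the three models, while the role names, agent names, commitments, and the pairwise notion of "possible interaction" are simplifying assumptions of ours.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class OrganizationalModel:
    """Roles intended by the organizational stakeholders."""
    roles: List[str]


@dataclass
class SocialContract:
    """Commitments regulating how one agent enacts one role."""
    agent: str
    role: str
    commitments: List[str]


@dataclass
class SocialModel:
    """Agent population of an organizational model, given as contracts."""
    om: OrganizationalModel
    contracts: List[SocialContract] = field(default_factory=list)

    def enact(self, contract: SocialContract) -> None:
        # A contract may only refer to a role defined in the organizational model.
        if contract.role not in self.om.roles:
            raise ValueError(f"unknown role: {contract.role}")
        self.contracts.append(contract)


@dataclass
class InteractionModel:
    """Possible interactions between the agents of a given population."""
    sm: SocialModel

    def possible_pairs(self) -> List[Tuple[str, str]]:
        agents = sorted({c.agent for c in self.sm.contracts})
        return [(a, b) for i, a in enumerate(agents) for b in agents[i + 1:]]


# Hypothetical roles and agents, for illustration only.
om = OrganizationalModel(roles=["knowledge seeker", "knowledge owner"])
sm = SocialModel(om)
sm.enact(SocialContract("alice", "knowledge owner", ["answer requests"]))
sm.enact(SocialContract("bob", "knowledge seeker", ["acknowledge sources"]))
im = InteractionModel(sm)
print(im.possible_pairs())
```

The sketch reflects the layering of the framework: a Social Model is defined relative to an Organizational Model, and an Interaction Model relative to a concrete agent population.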

After all models have been specified, the characteristics and requirements of the so- 
ciety can be incorporated in the implemented software agents themselves. Agents will 
thus contain enough information and capability to interact with others according to the 
society specification. Figure 5 depicts the relation between the different models. The 
OperA methodology supports the specification of an Organizational Model by analyzing 
a given domain and determining the type and structure of the agent society that best 
models that domain, as described in [29]. The methodology provides generic facilitation 
and interaction frameworks for agent societies that implement the functionality derived 
from the co-ordination model applicable to the problem domain. Standard society types 
such as market, hierarchy, and network can be used as starting points for development 
and extended where needed; they determine the basic norms and facilitation roles 
necessary for the society. These coordination models describe the different types of 
roles that can be identified in the society and issues such as communication forms, 
desired social order, and co-operation possibilities between partners. The OperA 
methodology and framework have been applied to the design of Knowledge Market, an 
agent society to support peer-to-peer knowledge sharing in a Community of Practice; 
it has been designed in such a way that it preserves and recognizes individual ownership 
of knowledge and enables the specification and monitoring of reciprocity agreements 
[26]. 

3.4 Description of Example Systems: Concluding Remarks 

In Section 2, we presented a framework for the description of agent-based Knowledge 
Management systems with the main dimensions system development level, macro-level 
structure, and KM application area. The analysis of several KM systems in Sections 
3.1-3.3 shows that this space is not fully covered by the research approaches and 
prototypes presented (see also Table 1). Two factors may contribute to this fact: 

1. Though at first glance only the last dimension (the application area) seems to 
be KM specific, the dimensions are not really independent. If, for example, 
knowledge use and internalization by specialized presentation techniques is the focus 
of research, an “agentification” of all knowledge sources may well be technological 
overkill. Or, the other way around, comprehensive KM frameworks may require 
more powerful agent architectures to cope with the complexity of various KM tasks. 
Table 1. Typical Operation Points within the Design Space of Agent-based KM Systems 



2. Sparsely populated areas in the design space spanned by the description framework 
may simply not yet have been investigated by current research. 

While the first case covers operating points that simply make no sense for agent-based 
Knowledge Management, the second may lead towards new research aspects. We think 
that some papers in this book are well suited for stimulating thoughts in new directions. 
4 Summary and Outlook 

The goal of this paper was twofold: i) to clarify the relationship between typical 
characteristics of Knowledge Management environments and core features of software agents 
as a basic technology to support KM, and ii) to provide a framework for the analysis 
and description of agent-based KM systems. 

In Section 1 we emphasized four main characteristics of Knowledge Management 
which in our opinion fundamentally account for the suitability of agent-based systems 
for supporting KM: 

- The distributed nature of knowledge may — from a technical point of view — 
raise special challenges, but for an organization and its individuals it is the only 
way to cope with the complexity of knowledge and should therefore be seen as 
an imperative and not as a nuisance. Agents are a natural form to represent that 
knowledge is created and used by various actors with diverse objectives. Socially- 
enabled agents can also help to tackle derived questions like accountability, trust, 
etc. 

- The inherent goal dichotomy between business processes and KM processes leads 
to the fact that knowledge workers typically do not adopt KM goals with a high pri- 
ority. Proactive agents may be able to stand in for (or at least remind the knowledge 
worker) when KM tasks fall behind. 

- Knowledge work as well as KM in general is “wicked problem solving” without a 
fixed a-priori description of goals and solution paths. Reactive and proactive behav- 
ior of agents help to reach the necessary degree of flexibility. Social skills of agents 
can facilitate the management of the complexity of interactions that are typical for 
wicked problem solving. 

- The continuously changing environments are not entirely an intrinsic KM charac- 
teristic, but nevertheless any IT support for KM has to deal with this given factor. 
Agent approaches allow for extensibility and openness in situations where it is im- 
possible to know at design time exactly which components and uses the system will 
have. 

In Section 2 we developed a framework for the description of agent-based KM 
systems with the main dimensions 

- system development level (analysis, design, implementation), 

- macro-level structure (single agent, heterogeneous, or homogeneous MAS), and 

- KM application area (knowledge distribution, generation, use, etc.). 
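
Read as a design space, the three dimensions span a grid of operating points. The Python sketch below enumerates that grid under the assumption that only the values named above are used (the application-area list is open-ended in the text); the single system placement is hypothetical and does not reproduce Table 1.

```python
from itertools import product

# Values named in the text; the application-area list is open-ended ("etc.").
DEVELOPMENT_LEVELS = ["analysis", "design", "implementation"]
MACRO_STRUCTURES = ["single agent", "heterogeneous MAS", "homogeneous MAS"]
APPLICATION_AREAS = ["knowledge distribution", "knowledge generation", "knowledge use"]

design_space = list(product(DEVELOPMENT_LEVELS, MACRO_STRUCTURES, APPLICATION_AREAS))


def uncovered(described_systems):
    """Return the operating points not occupied by any described system."""
    occupied = set(described_systems)
    return [point for point in design_space if point not in occupied]


# Hypothetical placement of one system, for illustration only.
example = [("design", "heterogeneous MAS", "knowledge distribution")]
print(len(design_space), len(uncovered(example)))
```

Such an enumeration only makes the notion of "sparsely populated areas" explicit; whether an unoccupied point is meaningful for agent-based KM remains a matter of analysis.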

The synopsis of exemplary agent-based KM systems in Section 3 with respect to these 
dimensions showed how the design space is covered by today’s research approaches, 
prototypes and systems. Though most applications are not entirely agent-based from 
organizational analysis to system implementation, the potential of agent technology in 
all phases was demonstrated. On the other hand, the vast majority of 
KM applications nowadays are not explicitly agent-based. Thus, there is still much work 
to be done in order to fathom the capability of agent technology for KM information 
systems. 
As the development of a comprehensive “Agents-in-KM Roadmap” is well beyond 
the scope of this paper, we just briefly sketch a couple of directions that may be inter- 
esting for future research: 

1. Socio-technical: How can the teamwork of human knowledge workers and artificial 
agents (that might act “on behalf of” people) be balanced? Questions from human- 
computer interaction arise here, but also questions of trust, responsibility, etc. 

2. Agent technology and KM functionality: What agent models and architectures are 
needed for what kind of KM application? Should concepts of trust, responsibility, 
rights, obligations be integrated in the models? How can the flexibility of reactivity 
and proactivity be better exploited for KM tasks? Which new functionalities can 
agent-based systems offer to KM? 

3. Methodological and engineering aspects: Which functionalities can be provided 
as a kind of “KM middleware” or as modules for building KM applications? How 
should agent-orientation of design and implementation be reflected in an “agent- 
based KM methodology” in order to facilitate transitions between different phases 
in the development cycle? 

4. Evaluation of agent-based KM: How well does the integration of (non agent- 
based) legacy systems into agent environments work in real-world applications 
(case studies)? How easily can new agent-based components really be integrated 
into an existing system? Which evaluation paradigms can be used to make different 
KM applications more comparable (agent-based vs. agent-based, but also agent- 
based vs. “traditional”)? 

At the moment it is hard to argue (and indeed not aimed at in this paper) that agent- 
based systems can do things that could not also be done using conventional technology, 
especially when only the implementation level is considered. However, we believe that 
agent technology helps to build KM systems faster and more flexibly. We think that 
the results presented in this paper and in the other contributions in this book have the 
potential to strengthen the hope that an agent-oriented view (regardless of the imple- 
mentation technology) leads to a more human-centered, more agile, and more scalable 
KM support. 
Abstract. Distributed Knowledge Management is an approach to 
knowledge management based on the principle that the multiplicity (and 
heterogeneity) of perspectives within complex organizations should not 
be viewed as an obstacle to knowledge exploitation, but rather as an 
opportunity that can foster innovation and creativity. Despite a wide 
agreement on this principle, most current KM systems are based on the 
idea that all perspectival aspects of knowledge (including the process of 
its creation) should be eliminated in favor of an objective and general 
representation in a sort of corporate knowledge base. In this paper we 
criticize this approach, and propose a peer-to-peer architecture (called 
KEx), which implements a distributed approach to Knowledge Management 
in a quite straightforward way: (i) each peer (called a K-peer) provides 
all the services needed to create and organize “local” knowledge 
from an individual’s or a group’s perspective, and (ii) social structures 
and protocols of meaning negotiation are introduced to achieve semantic 
coordination among autonomous peers (e.g., when searching documents 
from other K-peers). A first version of the system, called KEx, is imple- 
mented as a knowledge exchange level on top of JXTA. 



1 Introduction 

Distributed Knowledge Management (DKM), as described in [10], is an approach 
to knowledge management (KM) based on the principle that the multiplicity 
(and heterogeneity) of perspectives within complex organizations should not be 
viewed as an obstacle to knowledge exploitation, but rather as an opportunity 
that can foster innovation and creativity. 

The fact that different individuals and communities may have very differ- 
ent perspectives, and that these perspectives affect their representation of the 
world (and therefore of their work) is widely discussed - and generally accepted 
- in theoretical research on the nature of knowledge. Research on knowledge 
representation in artificial intelligence and cognitive science has produced much 
theoretical and experimental evidence that what people know is not a mere 
collection of facts, as any “fact” always presupposes some (typically implicit) 
interpretation schema, which provides an essential element of sense-making (see, for 
example, the notions of context [25,20,2], mental space [19], partitioned 
representation [15]); studies on the social construction of knowledge stress the social 
nature of interpretation schemas, viewed as the outcome of a special kind of 
“agreement” within a community of knowing (see, for example, the notions of 
scientific paradigm [23], frame [22], thought world [17], perspective [5]). 

Despite this large convergence, it can be observed that the high level architec- 
ture of most current KM systems in fact does not reflect this vision of knowledge 
(see [9,10,7] for a detailed discussion of this claim). The fact is that most KM 
systems embody the assumption that, to share and exploit knowledge, it is 
necessary to implement a process of “knowledge extraction and refinement”, whose 
aim is to eliminate all subjective and contextual aspects of knowledge, and create 
an objective and general representation that can then be reused by other people 
in a variety of situations. Very often, this process is aimed at building a central 
knowledge base, where knowledge can be accessed via a knowledge portal. In our 
opinion, this centralized approach - and its underlying objectivist epistemology 
- is one of the reasons why so often KM systems are deserted by users. 

In this paper we describe a peer-to-peer (P2P) architecture, called KEx, 
which is coherent with the vision of DKM. Indeed, P2P systems seem partic- 
ularly suitable to implement a DKM system. In KEx, each community is rep- 
resented by a knowledge peer (K-peer), and a DKM system is implemented in 
a quite straightforward way: (i) each K-peer provides all the services needed 
by a knowledge node to create and organize its own local knowledge, and (ii) 
social structures and protocols of meaning negotiation are introduced to achieve 
semantic coordination (e.g., when searching documents from other peers). A 
first version of KEx has been implemented on top of JXTA, a P2P open source 
project started in 2001 and supported by Sun (see http://www.jxta.org/). 

The paper is organized as follows: first, we briefly discuss the centralized vs. 
distributed paradigm in KM; second, we describe the main features of KEx, a 
peer-to-peer system for knowledge discovery and exchange, and argue why it provides 
a suitable system for distributed KM; then we describe the implementation of 
KEx; finally, we draw some conclusions and outline future work. 

2 Social and Technological Architectures for KM 

The starting point of our analysis is the wide agreement - in the organizational 
and sociological literature - on the fact that the creation, codification, and shar- 
ing of knowledge within complex organizations is a process that can be described 
along two qualitatively different dimensions: 

- on the one hand, knowledge is developed within communities, namely groups 
of people that share a common perspective (e.g., because they have a common 
goal, a common education, a common culture). This process, called perspective 
making in [5], corresponds to the incremental development of knowledge 
within a community, an idea closely related to the notion of normal 
science within a paradigm proposed by the philosopher T. Kuhn with respect 
to the development of scientific theories [23]; 

- on the other hand, knowledge is developed as a consequence of the interaction 
between different communities. This process, called perspective taking in [5], 
corresponds to a discontinuity in a community’s development (in science, 
Kuhn would call it a scientific revolution). It is less common (and 
definitely harder) than the first one, as it requires the ability to “map” the 
point of view of another community onto one’s own community’s perspective, 
an operation that presupposes the cognitive ability of “transcending” a local 
perspective and making explicit the assumptions that, within a community, 
all take for granted. 

An important assumption underlying our work, which we share with the 
structurationist approach [27] in organization sciences, is that technology and 
organization are tightly interrelated dimensions, which need to be reciprocally 
coherent. The more an organizational process involves high level human activi- 
ties, the stronger the interdependence between technological and organizational 
dimensions is. In particular, since each approach to cognition makes specific 
assumptions about the role of communication, a technology that strongly structures 
social communication implies a particular model of how cognition occurs [6]. 

From this point of view, we suggest that the main problem of most current 
KM systems is that they do not support the two social (“pre-technological”) 
knowledge processes described above, but rather tend to impose a process of a 
very different nature, namely a process whereby people: 

— generate knowledge through peripheral socialization in communities of prac- 
tices [33,14]: through work practice, employees generate implicit knowledge 
in terms of working solutions that can be fruitfully made explicit and thus 
reusable; 

— contribute with their knowledge through a codification process: knowledge 
is categorized and validated by experts according to a corporate language; 

— retrieve knowledge using a unified access to the organizational memory: 
through the use of manuals, procedures, routines, or the access to formal 
training, people have access to corporate knowledge. 

Technological architectures are then designed in accordance with this view 
of organizational cognition. The result is a semantically centralized KM 
architecture which aims at: 

- creating and enabling communication within formal and informal groups and 
communities (e.g., through “virtual communities” and groupware applications, 
which allow individuals to interact and produce their “raw” peripheral 
knowledge); 

- collecting “raw” peripheral knowledge through participation. Workers can 
contribute to creating and feeding knowledge using automatic document management 
tools, clustering, text mining, and information retrieval applications to 
make knowledge explicit and collect it; 

- categorizing and storing knowledge in databases and repositories according 
to a common and shared system of meaning, thereby distilling knowledge 
that is useful for the entire organization; 

- designing a corporate system of meanings in the form of a common language, 
an ontology, a knowledge map, a categorization, or a classification system 
that is necessary to codify knowledge according to a shared interpretative 
schema; 

- creating an Enterprise Knowledge Portal (EKP) that provides a unique, 
standard access point to corporate knowledge for the members of different 
organizational units. Typically, people access the KB through various forms 
of personalization tools (e.g., individual or group profiles, views, chats, and 
so on). 

Through the analysis of a paradigmatic case study (a worldwide consulting 
firm), [9] shows that centralized KM systems are often deserted by end-users; 
indeed, as Bowker and Star argue in [13], any approach which disregards the 
plurality of interpretative schemas is perceived either as irrelevant (there is no 
deep understanding of the adopted and centralized schema), or as oppressive 
(there is no agreement on the unique schema, which is therefore rejected). 

Recently, different groups of researchers have started to realize that we need 
technological architectures that are more coherent with the social model of 
organizational cognition. In [9,8], a distributed approach is proposed, in which 
organizational cognition is viewed as a distributed process that balances the 
autonomous knowledge management of individuals and groups, and the coordination 
needed to exchange knowledge across different autonomous entities; from 
this perspective, technology is viewed as a way of enabling distributed control, 
differentiation, customization, and redundancy. In such a vision, technology should 
mainly support the autonomous creation and organization of knowledge locally 
produced by individuals and groups and, on the other hand, support coordi- 
nation processes among autonomous entities, in order to exchange and share 
knowledge. In particular this means: 

— giving each community the possibility to represent and organize knowledge 
according to its goals and interpretative perspective. The building blocks of 
the system are the so-called knowledge nodes [7] (KNs), namely the organi- 
zational units - either formal (e.g. divisions, market sectors) or informal (e.g. 
interest groups, communities of practices, communities of knowing) - which 
exhibit some degree of semantic autonomy 3 ; 

— providing tools to support the exchange of knowledge across different KNs 
without assuming shared meanings, but rather enabling the dynamic transla- 
tion of different meanings. The KNs are thus materialized by local technolo- 
gies that represent a semantically autonomous expression of local knowledge 
owned by an individual or a group; 

— setting mechanisms and protocols to enable the emergent and bottom-up for- 
mation of informal communities and communication practices (such as find- 
ing or addressing people to trusted individuals/communities). Here a DKM 

3 Semantic autonomy means the ability to develop autonomous interpretative schemas 
(perspectives on the world) to interpret, organize, and store useful information. 
system supports the formation of groups and knowledge discovery/propagation 
through social cooperation. 

In the following section we show how this architecture has been implemented 
in a P2P system, which provides a peer-mediated support to a distributed ap- 
proach to designing KM systems. In the conclusions, we will make some remarks 
on the relation between P2P and agent-mediated knowledge management, and 
why we chose the former. 



3 KEx: A P2P Architecture for DKM 

KEx is a P2P system which allows each KN (be it an individual or a community) 
to build its own knowledge space within a network of autonomous K-peers, to 
make knowledge in this space available to other K-peers, and to search relevant 
knowledge in the knowledge space of other K-peers. We stress the fact that 
a K-peer may contain not only a structured collection of documents, but also 
relational knowledge, such as references to experts in some domain, links to other 
K-peers, to external resources, and so on. 

In the following sections, we describe the high-level architecture of KEx, and 
explain the role that each element plays in a DKM perspective. 



3.1 K-peers 

K-peers are the building blocks of KEx. From an organizational perspective, each
K-peer represents a Knowledge Node [7], namely the reification of an
organizational unit - either formal (e.g. divisions, market sectors) or informal
(e.g. interest groups, communities of practice, communities of knowing) - which
exhibits some degree of semantic autonomy, namely the ability to develop
autonomous interpretative schemas (perspectives on the world) to interpret,
organize, and store useful information.

In KEx, each K-peer can play two main roles: provider and seeker. A K- 
peer acts as a provider when it “publishes” in the system a body of knowledge, 
together with an explicit semantic view on it (called a context, in the sense 
defined in [11]); a K-peer acts as a seeker when it searches for information that 
matches some part of its context. 

Each K-peer has the structure shown in Figure 1. Below we illustrate the 
main modules. 



Document Repository. The Document Repository is the place where the data
of a knowledge node are stored. In general, it can be viewed as a private space in
which documents and other data are organized according to some local semantic
schema (e.g., a directory structure, or a database schema) and managed by
some local application (e.g., a DBMS, an HTTP server, a file system, a document
management system).
Fig. 1. KEx main components



Context Repository. A context is an explicit semantic schema over a body of
local knowledge. More than one context can be used to classify local knowledge.
The contexts in use in a K-peer are stored in a context repository. We observe
that local knowledge definitely includes documents from the document
repository, but it may also include links to other resources and mappings to
contexts stored in the context repositories of other KNs.

To make contexts usable in KEx, we use a web-oriented syntax for them,
called CTXML [11]. CTXML provides an XML-Schema specification of a
context; currently, contexts are concept hierarchies whose nodes are labelled
with words and phrases from natural language, and whose arcs are Is-A, Part-Of
or generic relations between nodes.
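
As a rough illustration of the data structure just described, a concept hierarchy with labelled nodes and typed arcs could be held in memory as follows. This is only a sketch in Python: it does not reproduce the CTXML syntax of [11], and the class name `Concept`, its fields, and the relation constants are assumptions introduced here for illustration.

```python
from dataclasses import dataclass, field

# Arc types named in the text; the string values are illustrative only.
IS_A, PART_OF, GENERIC = "is-a", "part-of", "generic"


@dataclass
class Concept:
    """A node of a concept hierarchy (hypothetical in-memory form)."""
    label: str                  # natural-language word or phrase
    relation: str = GENERIC     # type of the arc linking this node to its parent
    children: list["Concept"] = field(default_factory=list)

    def add(self, label: str, relation: str = GENERIC) -> "Concept":
        """Attach a child concept and return it."""
        child = Concept(label, relation)
        self.children.append(child)
        return child

    def paths(self, prefix: tuple[str, ...] = ()) -> list[tuple[str, ...]]:
        """Return every root-to-node path as a tuple of labels."""
        here = prefix + (self.label,)
        result = [here]
        for child in self.children:
            result.extend(child.paths(here))
        return result


# A small context; the arc types chosen here are arbitrary examples.
root = Concept("computer")
hardware = root.add("hardware", PART_OF)
printers = hardware.add("printers", IS_A)
printers.add("apple", GENERIC)
root.add("software", PART_OF)
```

Calling `root.paths()` lists all root-to-node paths, which is the form in which concepts are addressed when contexts are compared.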

From an organizational point of view, a context is the manifestation of a KN’s 
semantic autonomy. Even though a context can be a newly defined schema, in 
typical situations a context is a “translation” in CTXML of the local application 
schemas. For example, a context can be the representation of a user’s file system 
(where directory names are used as concept names and the sub-directory struc- 
ture is used as the structure of the concept hierarchy); or a context can be the 
representation in CTXML of the taxonomy of a document management system 
(where the taxonomy is used as a structure, and relations are Is-A relations).

From the standpoint of DKM, contexts are relevant in two distinct senses:

— on the one hand, they have an important role within each KN, as they provide 
a dynamic and incremental explicitation of its semantic perspective. Once 
contexts are reified, they become cognitive artifacts that contribute to the
process of perspective making [5], namely the consolidation of a shared view
in a KN, continuously subject to revision and internal negotiation among its 
members; 

— on the other hand, contexts offer a simple and direct way for a KN to make 
public its perspective(s) on the information that it can provide. Therefore,
as we will see, contexts are an essential tool for semantic
coordination/negotiation among different KNs.

Context Management Module. The context management module allows
users to create, manipulate, and use contexts. This module has two main
components: 

— Context editor: the context editor provides users with a simple interface to
create and edit contexts, and to associate documents and other information
with a context. This is done by letting users create links
from a resource (identified by a URI) to a node in a context. Examples
of resources are: documents in local directories, the addresses of database
access services, and the addresses of other K-peers that provide information
that a KN wants to explicitly classify in its own context.

— Context Normalization and Enrichment: this module provides two
important services for achieving semantic coordination among K-peers. The
first, called normalization, applies NL techniques (e.g., deleting stop words,
tokenizing, part-of-speech tagging, etc.) to user-defined contexts; the second,
called enrichment, provides an interface with an external linguistic resource
(in the current version of the system we use WordNet) or with an ontology
to add semantic information to concept labels (for example, that in a given
context “apple” means a fruit and not a computer brand). Both steps are
described in detail in [24].
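
The two services can be pictured with a small Python sketch. It is not the procedure of [24]: the stop-word list and the `SENSES` table are toy stand-ins for a real linguistic resource such as WordNet, part-of-speech tagging is omitted, and the function names are assumptions made for illustration.

```python
import re

# Toy stand-ins: a real system would use a full stop-word list and a
# linguistic resource such as WordNet instead of these small tables.
STOP_WORDS = {"a", "an", "and", "of", "the", "for", "in"}
SENSES = {
    "apple": ["fruit", "tree", "computer brand"],
    "printers": ["output device"],
}


def normalize(label: str) -> list[str]:
    """Lower-case and tokenize a concept label, then drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", label.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def enrich(tokens: list[str]) -> dict[str, list[str]]:
    """Attach the candidate senses of each token (empty list if unknown)."""
    return {t: SENSES.get(t, []) for t in tokens}


tokens = normalize("The Printers of Apple")
enriched = enrich(tokens)
```

At this stage each label still carries all of its dictionary senses; choosing among them depends on the position of the label in the context.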

This notion of enrichment is not equivalent to introducing a shared (universal)
semantics in KEx. Indeed, the intuition is that the meaning of a concept label 
in a context has two components: 

— the first is the linguistic component, which means that the words or phrases 
used as concept labels have a standard meaning (or, better, a set of meanings) 
in a “dictionary”. This helps, for example, to distinguish between “apple” 
as a fruit and “apple” as a tree; 

— the second is a sort of pragmatic component, which is given by its position 
in a context (e.g., in a concept hierarchy in CTXML). This helps in
understanding what the user means on a particular occasion with a word (e.g.,
“apple” in a path like “computer/software/apple” is different from “apple”
in a path like “computer/hardware/printers/apple”, even though “apple”
has the same dictionary meaning).

The first component is public, namely it is shared among those who speak a
given language. The second is highly contextual, and cannot be computed a
priori, namely independently of the context in which it appears. In this sense,
contexts are not to be thought of as an alternative to the use of ontologies in
a KM system, but as a necessary integration 4. Indeed, a typical context does
not provide semantic information on the terms used in a concept hierarchy nor
on their relations, but only on how these terms are combined to create more
complex concepts in a classification schema. Consider again the path
“computer/hardware/printers/apple”. Even if we can decide that “apple” is used in
the sense of a computer brand, this path does not provide a definition of what a
computer brand is, whereas this definition could be found in a general ontology.
However, it is clear that understanding the concept associated with such a
path requires a lot of ontological knowledge about computers, hardware,
printers, and their relations in the domain of computers (e.g., that printers are a
kind of hardware, that there are different printer makers, that Apple is one of
them, and so on). Indeed, in the context matching algorithm we developed for
coordinating K-peers (see below), both ontological and contextual information
are used to find relations between concepts in different contexts.

It is also important to notice that different linguistic resources or ontologies
can be used to enrich a context. So far, we have been using WordNet, but
nothing prevents other resources from being used in its place. From the
standpoint of KM, this is an interesting feature, as we can imagine that different
communities may decide to adopt different linguistic resources or ontologies to
enrich their contexts, namely those which better suit their needs. This fact has
a significant impact on the mechanisms for sharing knowledge across K-peers,
as we will discuss later in this paper.



4 Roles of K-peers in KEx

Each K-peer can act as a provider, as a seeker, or play both roles
simultaneously. The main components of the two roles are described in Figure 1; the
interaction between seekers and providers is described in Figure 2. The following
sections explain in detail the components needed for implementing the two roles,
and their interaction.



4.1 Seeker 

The seeker module of KEx allows users to search and retrieve useful knowledge 
from other K-peers. The main module of the seeker component is the query 
maker. A query is built as follows: 

— the user opens a context from the context repository; 

— the user browses the context until a concept (i.e., a node in the context) is 
found which describes the topic of interest for that query; 

— the user clicks on that concept, this way telling the system that she is inter- 
ested in finding documents about that concept; 

— the system extracts a focus, namely the portion of the selected context which
contains relevant information about the selected concept (in the current
version of the query maker, the focus is the entire path from the concept to
the root of the concept hierarchy; see [24] for a formal definition of focus);

— if needed, the user can add one or more keywords to the query.

4 See [12] for a preliminary investigation of how this integration can be done.

When the user submits the query, the seeker activates a session that is
associated with that query and is in charge of resolving it (step 1 in Figure 2). The query
is propagated to the P2P system (see below for details). The active session can
asynchronously receive several incoming replies from those providers that have
been selected/suggested by other peers, and collects results by aggregating
all the replies that have been received; each result is made up
of a list of document descriptors (name of the document, short description, and
so on) and an indication of the part of the provider’s context that has been used
to interpret the meaning of the query and provide a resolution. Finally,
the seeker gives the user access to the K-peer downloading service; if the user
finds one or more interesting documents in the result set, she can contact the
providing K-peer to download them.
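
The query-building steps and the aggregation of replies described above might be sketched as follows. This is a minimal Python sketch under stated assumptions: the focus is taken to be the full path from the root to the selected concept, as in the current query maker, while the `PARENT` table, the `Query` and `Session` classes, and the peer and document names are hypothetical and not part of KEx.

```python
from dataclasses import dataclass, field

# Hypothetical child -> parent table for one context (the root has parent None).
PARENT = {
    "computer": None,
    "hardware": "computer",
    "printers": "hardware",
    "apple": "printers",
}


def extract_focus(concept: str) -> list[str]:
    """Return the path of labels from the root down to the selected concept."""
    path = []
    node = concept
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return list(reversed(path))


@dataclass
class Query:
    focus: list[str]
    keywords: list[str] = field(default_factory=list)


@dataclass
class Session:
    """Collects the replies that arrive for one query, in any order."""
    query: Query
    results: list[dict] = field(default_factory=list)

    def on_reply(self, provider: str, documents: list[str],
                 used_context: list[str]) -> None:
        """Aggregate one reply together with the provider-side context used."""
        self.results.append(
            {"provider": provider, "documents": documents, "context": used_context}
        )


session = Session(Query(extract_focus("apple"), keywords=["laser"]))
session.on_reply("peer-B", ["manual.pdf"], ["devices", "printers"])
```

Keeping the provider-side context with each result is what lets the user see how the provider interpreted the query.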

4.2 Provider 

The provider contains the functionalities needed to accept and resolve a query, 
and to identify the results that must be returned to the seeker. When a K-peer 
receives a query (keywords and focus), it instantiates a provider, configured to 
use a set of contexts and some documents (a particular directory), and to resolve 
the query. 

The main modules needed for the provider role are the following: 

— Query Solver: the query solver takes a contextual query as an input and 
returns a list of results to be sent back to the seeker. A contextual query is 
resolved: 

• by the Semantic Query Solver, if a focus is associated to the query 
itself; 

• by the Lexical Query Solver, if a list of keywords is associated to the 
query. 

If the query contains both a focus and a list of keywords, then both solvers 
are invoked and the result set is the intersection of the two result sets. 

— Query propagation: this module is in charge of propagating a query to 
other K-peers (see K-services below for the two modes of propagation al- 
lowed in KEx). 
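
The dispatch rule of the Query Solver (semantic solver for a focus, lexical solver for keywords, intersection when both are present) can be illustrated with a short Python sketch. The two solvers below are deliberately naive stand-ins with toy data; they are assumptions made for illustration, not the solvers implemented in KEx.

```python
# Toy provider-side data (hypothetical): concept label -> URIs, URI -> text.
CLASSIFIED = {"printers": {"doc1"}, "recipes": {"doc2"}}
DOCS = {"doc1": "apple laser printer manual", "doc2": "apple pie recipe"}


def semantic_stub(focus: list[str]) -> set[str]:
    """Stand-in for the Semantic Query Solver: look up the last focus label."""
    return set(CLASSIFIED.get(focus[-1], set()))


def lexical_stub(keywords: list[str]) -> set[str]:
    """Stand-in for the Lexical Query Solver: every keyword must occur."""
    return {uri for uri, text in DOCS.items()
            if all(k in text.split() for k in keywords)}


def resolve(focus, keywords, semantic=semantic_stub, lexical=lexical_stub) -> set[str]:
    """Apply the dispatch rule: focus only, keywords only, or both (intersection)."""
    if focus and keywords:
        return semantic(focus) & lexical(keywords)
    if focus:
        return semantic(focus)
    if keywords:
        return lexical(keywords)
    return set()


both = resolve(["computer", "hardware", "printers"], ["apple"])
only_keywords = resolve(None, ["apple"])
```

Here `both` contains only the document that satisfies the two solvers at once, whereas `only_keywords` contains every document mentioning the keyword.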

The Semantic Query Solver invokes a context matching algorithm, namely an 
algorithm that finds relations between concepts belonging to different contexts. 
The details of the algorithm are described in [30]. Here we only say that the 
algorithm is based on two main ideas: 

— that the information provided in a context presupposes a lot of implicit 
knowledge, which can be extracted automatically from an external resource 
(in our case, WordNet). In the current version of the algorithm, the result
of this extraction, combined with the explicit information of the context, 
is represented as a logical formula associated to each node in the concept 
hierarchy of a context; 

— that the problem of discovering what the relationship is between concepts of 
different contexts can be codified as a problem of propositional satisfiability 
of a set of formulae, namely the formulae associated to the concepts to be 
matched plus a set of “axioms” obtained by extracting relevant facts from 
the external resource. 

Given two concepts in two different contexts, the current version of the al- 
gorithm we implemented returns one of the following relations as its output: (i) 
the two concepts are equivalent, (ii) the first is more general than the second, 
(iii) the first is less general than the second, (iv) the two concepts are disjoint. In 
KEx, if one of the first three relations holds, then the URIs of the resources as- 
sociated to the concept on the provider side are returned to the seeker, together 
with the focus of the concept in the provider’s context. This is important, as 
users on the seeker side may have the opportunity to learn how users on the 
provider side classify a document in their context. 
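
To make the satisfiability idea concrete, the following Python sketch decides the four relations by brute-force enumeration of truth assignments. It is not the algorithm of [30]: formulas are modelled as Python predicates over an assignment, the axioms are written by hand rather than extracted from WordNet, and all names are assumptions introduced for illustration.

```python
from itertools import product


def satisfiable(atoms, formulas):
    """True if some truth assignment to the atoms satisfies every formula."""
    for values in product([False, True], repeat=len(atoms)):
        model = dict(zip(atoms, values))
        if all(f(model) for f in formulas):
            return True
    return False


def entails(atoms, axioms, premise, conclusion):
    """premise -> conclusion holds iff axioms + premise + not conclusion is unsatisfiable."""
    return not satisfiable(atoms, axioms + [premise, lambda m: not conclusion(m)])


def relation(atoms, axioms, first, second):
    """Classify the relation between two concept formulas under the axioms."""
    forward = entails(atoms, axioms, first, second)    # first is less general
    backward = entails(atoms, axioms, second, first)   # first is more general
    if forward and backward:
        return "equivalent"
    if backward:
        return "more general"
    if forward:
        return "less general"
    if not satisfiable(atoms, axioms + [first, second]):
        return "disjoint"
    return None  # none of the four relations could be established


atoms = ["printer", "hardware", "software"]
axioms = [
    lambda m: (not m["printer"]) or m["hardware"],       # printers are hardware
    lambda m: not (m["hardware"] and m["software"]),     # hardware excludes software
]
r1 = relation(atoms, axioms,
              lambda m: m["hardware"],
              lambda m: m["hardware"] and m["printer"])
r2 = relation(atoms, axioms, lambda m: m["software"], lambda m: m["printer"])
```

With these hand-written axioms, `r1` is "more general" and `r2` is "disjoint". Enumerating all assignments is exponential in the number of atoms, so a real implementation would rely on a SAT solver.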

It is important to observe that the semantic match is performed on the 
provider side. This means that the result reflects the provider’s interpretation 
of the seeker’s query. This explains why we can match contexts normalized and 
enriched with different linguistic resources or ontologies. The intuition is that, 
in this case, the provider will normalize and enrich the query’s focus using its 
own resource, this way assigning to it a meaning from the perspective of its 
users. Of course, the seeker’s users can disagree with the resulting match (if 
any). However, this is not dissimilar from what happens among human agents, 
as it may happen that a hearer’s answer is completely incompatible with the
speaker’s intended meaning for the question. 

The interaction between seeker and provider is depicted in Figure 2. The 
seeker sends a query to a provider (step 1). When a provider receives a query, 
it starts a query resolution session and selects relevant documents (step 2). The 
provider sends back to the seeker the result set (step 3). A provider can propagate
a query to other providers that, from its perspective, are “experts” about the 
query’s topic (step 4). Each provider to which the query is propagated activates 
a query resolution session, and sends the results to the seeker. 

4.3 K-services

KEx provides a collection of knowledge-related services that have an important
role in supporting knowledge exchange in a network of autonomous K-peers.
The most important among them are described in the following sections.

K-federations. KEx provides a federation management service. A K-federation
is a group of K-peers that agree to behave like a single entity when other
K-peers perform a search. In other words, each K-federation is a “social”
aggregation of K-peers that display some synergy in terms of content (e.g., as they
provide topic-related content or decided to use the same linguistic resource to
create a common “vocabulary”, thus providing more homogeneous and specific
answers), quality (certify content) or access policies (certify members). In this
sense, the addition of federations to KEx is not trivial, as they embody a first
(simple) form of semantic-driven aggregation.

Fig. 2. The KEx system: interaction between Seeker and Provider roles

To become a member of a K-federation, a K-peer must provide a K-federation
Service (quite similar to that required by the Provider role) that implements the
required federation protocol (reply to queries sent to the K-federation) and
observes the federation membership policy. Each K-peer can be a member of more
than one K-federation.

Currently, we do not make any assumption on how K-federations are formed 
(see, for example, [29,16] for two different methods of automatic and dynamic 
formation of communities). However, we anticipate two principal methods of 
constitution: 

Top-down: a group of K-peers decide to create a federation for organizational,
commercial, or strategic reasons. Depending on the type of agreement, a new
K-federation can be created and membership policies can be defined;

Bottom-up: there are tools that observe the interactions among K-peers and
detect the emergence of synergies among K-peers. In a corporation, this
may suggest the existence of a new (informal) community, and lead to the
creation of a K-federation to support it.

From a technological point of view, a K-federation becomes a new possible
target for a seeker’s queries. Indeed, as we can see from Figure 1, the
federation module uses the provider module, the main addition being that queries
are forwarded to all members of the federation. Currently, the result of sending
a query to a K-federation is very similar to the result of sending the query
directly to each member of the federation. There are two main differences: on the
one hand, each K-peer can select which of its contexts and documents can be
used to answer a query that comes from each federation to which it belongs; on
the other hand, K-peers that reply to a query notify seekers that they replied as
members of that K-federation.

In the future, we plan to implement smarter policies for handling queries 
within K-federations, like a query pre-processing or a selection of the members 
to which the query should be forwarded. 



Discovery. Discovery is a mechanism that allows users to discover resources 
in the P2P network. Users need to discover K-peers or K-federations available 
in the network as potential targets of a query. Each K-peer may advertise the 
existence of a resource by publishing an XML document (advertisement). In 
KEx, two types of resources are currently advertised:

— K-peers that have a provider service to solve queries. The main elements
of the advertisement are: a description of the peer’s contexts, and an address
to contact the K-peer in order to send it a query or retrieve documents;

— K-federations, namely groups of peers that have a federation service to
solve queries. The main elements of the advertisement are: the federation
domain description, contact information, membership policy.

To discover resources in a P2P network, K-peers can send a discovery request 
to an already known K-peer, or send a multi-cast request on the network, and 
receive responses (list of advertisements) that describe the available services 
and resources. It is possible to specify search criteria (such as a keyword or 
textual expression) that are then matched against the contents provided by the 
advertisement related to each K-peer or K-federation description. 
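
As an illustration of advertisement-based discovery, the sketch below publishes small XML documents and filters them by keyword using Python's standard `xml.etree.ElementTree` module. The element names, the `type` attribute, and the `peer://` addresses are invented for this example; they do not reproduce the advertisement format actually used by KEx or by the underlying P2P framework.

```python
import xml.etree.ElementTree as ET


def make_advertisement(kind: str, name: str, description: str, address: str) -> str:
    """Serialize a resource advertisement (hypothetical element names)."""
    adv = ET.Element("advertisement", {"type": kind})
    ET.SubElement(adv, "name").text = name
    ET.SubElement(adv, "description").text = description
    ET.SubElement(adv, "address").text = address
    return ET.tostring(adv, encoding="unicode")


def matches(adv_xml: str, keyword: str) -> bool:
    """True if the keyword occurs in any text field of the advertisement."""
    adv = ET.fromstring(adv_xml)
    text = " ".join((el.text or "") for el in adv.iter())
    return keyword.lower() in text.lower()


def discover(advertisements: list[str], keyword: str) -> list[str]:
    """Return the advertisements that satisfy the search criterion."""
    return [a for a in advertisements if matches(a, keyword)]


ads = [
    make_advertisement("k-peer", "peer-B", "contexts about printers and hardware",
                       "peer://peer-B"),
    make_advertisement("k-federation", "cooking", "recipes and food",
                       "peer://fed-cooking"),
]
found = discover(ads, "printers")
```

A seeker would then pick targets for its query among the advertisements returned by `discover`.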



Query Propagation. When a provider receives a query, it can decide to for- 
ward it to another Provider that is considered “expert” about the query’s topic. 
To decide to which peers the query is to be forwarded, a peer has two possibili- 
ties: 



— physical “neighborhood”: the query will be sent to peers known through the
discovery functionality. This way, providers that are not directly reachable 
by the Seeker, or have just joined the system, can advertise their presence 
and contribute to the resolution of queries; 

— semantic “neighborhood”: if the provider computes some matching between
a query and a concept in its own context, the system will look for addresses 
of other K-peers that are linked to that concept. If some are found, then 
the query is propagated to them, based on the assumption that K-peers 
classified under a concept may possess relevant information about it. 

Obviously, several parameters and mechanisms control the scope
of the search and prevent message “flooding”: setting a time to live (TTL),
limiting the number of hops, storing in the query the names of the peers that
already received the query, and so on.
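
The effect of a TTL combined with a list of already visited peers can be simulated in a few lines of Python. This is a centralized toy simulation, not the KEx propagation code: in a real network each peer would forward the query itself and the visited list would travel inside the query. The `NEIGHBORS` graph and the function name are hypothetical.

```python
from collections import deque

# Hypothetical propagation links: peer -> peers it would forward a query to.
NEIGHBORS = {"A": ["B", "C"], "B": ["A", "D"], "C": ["D"], "D": ["E"], "E": []}


def propagate(start: str, ttl: int) -> list[str]:
    """Return the peers reached by a query, in the order they receive it."""
    visited = {start}                 # names of peers that already got the query
    queue = deque([(start, ttl)])
    order = []
    while queue:
        peer, remaining = queue.popleft()
        order.append(peer)
        if remaining == 0:            # TTL exhausted: do not forward further
            continue
        for nxt in NEIGHBORS[peer]:
            if nxt not in visited:    # skip peers that already received it
                visited.add(nxt)
                queue.append((nxt, remaining - 1))
    return order


reached = propagate("A", ttl=2)
```

With a TTL of 2 the query started at A stops at D and never reaches E, and the link from B back to A is ignored because A is already in the visited set.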



Learning. When the matching algorithm finds a semantic correspondence be- 
tween concepts of different contexts, the Provider can store this information for 
future reuse. This information is represented as a semantic “mapping” between 
concepts (see [11]), and can be used in three ways: 

— when the K-peer receives a query from a seeker, it can reuse the correspond- 
ing stored mapping to avoid running the matching algorithm; 

— a provider can use the existing mapping to forward the query to other peers 
that present a semantic relation with the topic of the query (see semantic 
propagation above); 

— the seeker can search the mapping relations in order to suggest to the user
a set of providers with which it has interacted in the past and which are
classified as qualified with respect to the meaning of the concept selected in a query.
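
The first use of stored mappings, reusing a previous result instead of running the matching algorithm again, amounts to a cache keyed by pairs of concepts. The Python sketch below shows the idea only; the class `MappingStore`, its call counter, and the trivial stand-in matcher are assumptions made for illustration and do not reflect how mappings are represented in [11].

```python
class MappingStore:
    """Remembers the relation found between a seeker and a provider concept."""

    def __init__(self, matcher):
        self._matcher = matcher       # function(seeker_path, provider_path) -> relation
        self._mappings = {}           # (seeker_path, provider_path) -> relation
        self.matcher_calls = 0        # counts how often the matcher actually ran

    def relation(self, seeker_path: tuple, provider_path: tuple) -> str:
        """Return the stored mapping, computing and storing it on first use."""
        key = (seeker_path, provider_path)
        if key not in self._mappings:
            self.matcher_calls += 1
            self._mappings[key] = self._matcher(seeker_path, provider_path)
        return self._mappings[key]


def toy_matcher(seeker_path: tuple, provider_path: tuple) -> str:
    """Stand-in for the matching algorithm: compare the last labels only."""
    return "equivalent" if seeker_path[-1] == provider_path[-1] else "disjoint"


store = MappingStore(toy_matcher)
first = store.relation(("computer", "printers"), ("devices", "printers"))
second = store.relation(("computer", "printers"), ("devices", "printers"))
```

After the two calls above the matcher has run only once; the second answer comes from the stored mapping.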

Using these mechanisms, the K-peer network will define and increase the
number and quality of the semantic relations among its members, so that it
becomes a dynamic web of knowledge links.

5 Development Framework 

KEx is built on top of JXTA, a set of open, generalized peer-to-peer protocols
that allow devices to communicate and collaborate through a connecting
network. This P2P framework also provides a set of protocols and functionality
such as a decentralized discovery system, an asynchronous point-to-point messaging
system, and a group membership protocol. A peer is a software component that
runs some or all of the JXTA protocols; every peer has to agree upon a common
set of rules to publish, share and access “resources” (like services, data or
applications), and to communicate with other peers. Thus, a JXTA peer is used to
support higher-level processes (based, for example, on organizational
considerations) that are built on top of the basic peer-to-peer network infrastructure;
they may include the enhancement of basic JXTA protocols (e.g. discovery)
as well as user-written applications. JXTA tackles these requirements with a
number of mechanisms and protocols: for instance, the publishing and
discovery mechanisms, together with a message-based communication infrastructure
(called “pipe”) and peer monitoring services, support decentralization and
dynamism. Security is supported by a membership service (which authenticates
any peer applying to a peer group) and an access protocol (for authorization
control). The flexibility of this framework makes it possible to design distributed
systems that cover all the requirements of a DKM application, using the JXTA P2P
capabilities, completed and enhanced through the implementation of user-defined
services. As shown in the previous sections, in the KEx system we combine the
P2P paradigm (characterizing a KN network as a network of distributed peers)
and JXTA as an implementation infrastructure in a coherent vision with the
DKM paradigm.

These features of a peer-to-peer system seem to match the spirit and the 
main non-functional aspects of a KN in a DKM application, and suggest P2P
systems as a natural architectural solution (see [3] for a discussion of this idea).
In particular: 

— autonomy is guaranteed by the fact that each KN can be seen as a peer which 
owns local knowledge, stored and organized through local technologies and 
applications; 

— coordination is guaranteed by enabling peers to collaborate with each other,
using a set of dynamic and heterogeneous services that peers provide to
each other, in order to support both communication features (such as the
discovery functionality) and semantic services (e.g. exchanging information without
imposing a common interpretation schema, but through a meaning
negotiation service that automatically maps concepts among different systems of
meanings).



6 Conclusions and Research Issues 

In this paper, we argued that technological architectures, when dealing with pro- 
cesses in which human communication is strongly involved, must be consistent 
with the social architecture of the process itself. In particular, in the domain 
of KM, technology must implement a principle of distribution that is intrinsic 
to the nature of organizational cognition. This distributed approach is becom- 
ing more generally accepted, and other groups are working in the direction of 
building distributed organizational memories (see e.g. [18,32]). 

Here we also suggest that P2P infrastructures are especially suitable for
distributed KM applications, as they naturally implement the principles of
autonomy and distribution. It is interesting to observe that other research areas
are also moving toward P2P architectures. In particular, we can mention the work
on P2P approaches to the semantic web [1,31], to databases [21], and to web services
[28]. We believe this is a general trend, and that in the near future P2P
infrastructures will become more and more interesting for all areas where we cannot
assume a centralized control.

A number of research issues need to be addressed to map aspects of dis- 
tributed cognition into technological requirements. Here we propose two of them: 

— social discovery and propagation: in order to find knowledge, people
need to discover who is reachable and available to answer a request. On
the one hand, broadcasting messages generates communication overflow; on
the other hand, talking just to physically available neighbors reduces the
potential of a distributed network. A third option could be for a seeker
to ask his neighbors who they trust on a topic and, among them, who is
currently available. Here the question is about the social mechanisms through
which people find, based on trust and recommendation, other people to
involve in a conversation. A similar approach could be used in order to
support the propagation of information requests;

— building communities: if we consider communities as networks of people 
that, to some extent, tend to share a common perspective [5], mechanisms are 
needed to support the bottom-up emergence of semantic similarities across 
interacting KNs. Through this process, people can discover and form virtual 
communities, and within organizations, managers might monitor the evolving
trajectories of informal cognitive networks. Such networks can then be
viewed as potential neighborhoods to support social discovery and
propagation. To this end, not only techniques based on the explicitation of semantic
schemas can be used (e.g., contexts in KEx), but also techniques based on
the observation of what users do; a possible extension in this direction is
under development using the notion of implicit culture described in [4].

A final remark concerns the relation between agent-based and P2P platforms
for distributed KM. We believe that P2P infrastructures are a very
straightforward way of mapping social architectures onto technological architectures, thus
guaranteeing the coherence between the two structures. From a conceptual
point of view, peers are much simpler than agents, and do not make it possible to
exploit the potential for reasoning, coordination and collaboration, and planning
that agents can provide. However, we need to be very careful in introducing
software which can have a significant impact on the way people work and manage
their own and corporate knowledge. In other words, we need conceptual tools
for designing agent-based applications which are coherent with the social
architecture. An interesting attempt to provide such a methodology can be found
in [26], where intentional modeling techniques are used to analyze organizations
as sets of actors that cooperate (or compete) to achieve their goals. This analysis
is applied in particular to applications of distributed KM.


Abstract. Knowledge management systems will presumably benefit from
intelligent interfaces, including those with animated conversational agents. One 
of the functions of an animated conversational agent is to serve as a 
navigational guide that nudges the user on how to use the interface in a productive
way. This is a different function from delivering the content of the material. 
We conducted a study on college students who used a web facility in one of 
four navigational guide conditions: Full Guide (speech and face), Voice Guide, 
Print Guide, and No Guide. The web site was the Human Use Regulatory 
Affairs Advisor (HURAA), a web-based facility that provides help and training 
on research ethics, based on documents and regulations in United States Federal 
agencies. The college students used HURAA to complete a number of learning 
modules and document retrieval tasks. There was no significant facilitation of 
any of the guides on several measures of learning and performance, compared 
with the No Guide condition. This result suggests that the potential benefits of 
conversational guides are not ubiquitous, but they may save time and increase 
learning under specific conditions that are yet to be isolated. 



1 Introduction 

Knowledge management systems are expected to be facilitated by intelligent 
interfaces that guide users who vary in cognitive abilities, domain knowledge, and
computer literacy. Some users will not have the patience to learn systems that are not 
used very often. These users will need fast and easy guidance. Some prefer to talk 
with agents in a conversational style rather than reading dense printed material on a 
computer screen and typing information via keyboard. Therefore, there has been 
serious interest in intelligent interfaces that have speech recognition and animated 
conversational agents. These agents incorporate synthesized speech, facial 
expressions, and gestures in a coordinated fashion that attempts to simulate a 
conversation partner. An ideal interface would allow the user to have a conversation 
with the computer, just as one would have a conversation with a person. 

Animated conversational agents have been explored in the context of learning 
environments and help systems during the last decade [2], [3], [4], [11], [12], [14], 
[19]. There is some evidence that AutoTutor, a tutoring system with an animated
conversational agent, improves learning when college students learn about computer 
literacy or conceptual physics by holding a conversation with the computer tutor [11], 
[21]. However, it is still unsettled what aspects of a conversational agent might be 
effective, and under what conditions [2], [19], [23]. Is it the voice, the facial 
expressions, the responsiveness to the user, the gestures, the content of the messages, 
or some combination of these features? Whittaker (2003) has concluded that the 
voice is particularly effective in promoting learning and in engaging the user’s 
attention, but the other components of the agent may be effective under specific 
conditions that are not yet completely understood. 

One potential function of an animated conversational agent is to serve as a 
navigational guide to offer suggestions on how the user might use the interface in a 
productive way. This is an entirely different function from delivering the content of 
the material that would otherwise be read. The purpose of the present study was to 
investigate different types of conversational navigational guides that are available to 
adults when they use a new web site. Do these guides save time for the user when
the agents offer suggestions on what to do next? Does the user acquire more
information because of the time that is allegedly saved? What are the perceptions of
users toward conversational navigational guides? Do they like them, or are the
suggestions irritating? It is widely acknowledged that Microsoft’s Paperclip
irritated many users because of its intrusiveness and the difficulty of getting rid of it. 
Perhaps a better designed, more conversationally appropriate, agent would be more 
appreciated by the user. 

We conducted a study on 155 college students who used a web facility in one of 
four navigational guide conditions: Full Guide (speech and face), Voice Guide, Print 
Guide, and No Guide. The web site was the Human Use Regulatory Affairs Advisor 
(HURAA), a web-based facility that provides help and training on the ethical use of 
human subjects in research, based on documents and regulations in United States 
Federal agencies [9]. The college students used HURAA to complete a number of 
training modules and document retrieval tasks. 



2 Different Types of Navigational Guides 

The Full Guide was a talking head with synthesized speech, facial expressions, and 
pointing gestures. The Agent told the user what to do next when the user first 
encountered a web page. For example, when the user entered the “Explore Issues” 
module, the Agent said, “Select the issue that you would like to explore.” The talking 
head also moved to direct the user’s attention to some point on the display. For 
example, the talking head looked down when he said “You may select one of the 
options below me.” The talking head told the user what each primary and secondary 
module was supposed to do, after the user rested the mouse pointer over a module 
link for more than 2 seconds. The Agent was designed to project an authoritative 
persona and to help the user navigate through the interface more quickly. Many 
novice users are lost and don’t know what to do next when they encounter a page. 
The Agent was designed to reduce this wasted time. 

In order to directly test the influence of the Agent as a navigational guide, 
participants were randomly assigned to one of the following four conditions: 

(a) Full Guide. There is the full talking head. 

(b) Voice Guide. There is a voice that speaks, but no head. 

(c) Print Guide. The guidance messages are printed at the location where the talking 
head normally is. 

(d) No Guide. There are no messages of navigational guidance, either spoken or in 
print. 

If a navigational guide is important, then the completion of the various tasks should 
be poorer in the No Guide condition than in the other three conditions: d < min{a, b, c}. 
When considering the three conditions with guidance, there is a question of which 
medium is effective. If speech reigns supreme, then c < min{a, b}. This would be 
predicted by available research that has compared the impact of spoken versus printed 
text on comprehension and memory [2], [19], [23]. If print is superior, then the 
prediction would be that c > max{a, b}. If the presence of the face provides a persona 
effect that improves interactivity [14], then the prediction is a > b. However, if the 
face is a distraction from the material in the main display, then the prediction is a < b. 
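
The competing predictions can be expressed as simple predicates over the four condition means. The sketch below assumes, as the inequalities imply, that a, b, c, and d stand for performance in the Full Guide, Voice Guide, Print Guide, and No Guide conditions, and that higher values mean better performance. The function name and the sample means are hypothetical and are not taken from the study.

```python
def evaluate_predictions(a, b, c, d):
    """Report which directional predictions a set of condition means satisfies.

    a, b, c, d: means for Full Guide, Voice Guide, Print Guide, No Guide
    (assumed mapping; higher = better performance).
    """
    return {
        "guide_helps": d < min(a, b, c),      # any guide beats no guide
        "speech_superior": c < min(a, b),     # print is worse than both spoken guides
        "print_superior": c > max(a, b),      # print is better than both spoken guides
        "face_helps": a > b,                  # persona effect
        "face_distracts": a < b,              # face draws attention away
    }


# Hypothetical means, chosen only to exercise the predicates.
result = evaluate_predictions(a=0.60, b=0.55, c=0.50, d=0.40)
```

These predicates compare raw means only; deciding whether a difference is reliable still requires a significance test.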



3 Human Use Regulatory Affairs Advisor (HURAA) 

HURAA is a web-based facility that provides help, training, and information retrieval 
on the ethical use of human subjects in research. The content of HURAA is derived 
from Federal agency documents and regulations, particularly the National Institutes of 
Health [20], the Department of Defense [6], [7], and particular branches of the US 
military. The targeted users of HURAA focus on fundamental ethical issues, but not 
the detailed procedures and paper work associated with gaining approval from 
Institutional Review Boards. 

The design of HURAA was guided by a number of broader objectives. The layout 
and design of the web facility incorporate available guidelines in human factors, 
human-computer interaction, and cognitive science [5], [17]. The architecture of the 
HURAA components needed to conform to the ADL standards for reusable 
instructional objects, as specified in the Sharable Content Object Reference Model 
[22]. The primary objective of having these standards is to allow course content to 
be shared among different lesson planners, computer platforms, and institutions. 
HURAA was designed to optimize both learning and information transmission. Adult 
users are likely to have very little time, so it is important to optimize the speed and 
quality of learning in web-based distance learning environments. This requires 
careful consideration of the pacing of the information delivery, the selection of 
content, and design of the tasks to be performed. The web site was meant to be 
engaging to the user, so it included persuasive multimedia intended to keep the user 
on the website. Finally, HURAA incorporated some of the sophisticated 
pedagogical techniques that have been implemented in advanced learning 
environments with intelligent tutoring systems and animated conversational agents. 

HURAA has a number of standard features of conventional web facilities and 
computer-based training, such as hypertext, multimedia, help modules, glossaries, 
archives, links to other sites, and page-turning didactic instruction. HURAA also has 
more intelligent features that allegedly promote deeper mastery of the material, such 
as lessons with case-based and explanation-based reasoning, document retrieval 
through natural language queries, animated conversational agents, and 
context-sensitive Frequently Asked Questions (called Point & Query [10]). Additional 
details about HURAA can be found in Graesser, Hu et al., 2002. This paper directly 
focuses on some of the tasks users would complete with HURAA and what impact the 
four different guides had on the completion of these tasks and the users’ perceptions of 
the learning environment. 



4 Materials and Procedure 

The experiment included three benchmark tasks that participants completed while 
interacting with HURAA. This was followed by a series of tests and surveys that 
were completed after they interacted with HURAA. We refer to these two phases as 
the HURAA acquisition phase and the post-HURAA test phase, respectively. The 
next section describes the modules and HURAA facilities that are directly relevant to 
the performance evaluation. The participants were 155 undergraduate students at the 
University of Memphis and Rhodes College who participated for course credit or for 
money ($20). 

4.1 HURAA Acquisition Phase 

Introduction. The Introduction Module is a multimedia movie that plays 
immediately after a new user has logged in. It is available for replay for users who 
want to see a repeat. The Introduction is intended to impress the user with the 
importance of protecting human subjects in research. It introduces the user to the 
basic concepts of the Common Rule [6], [20], of the Belmont Report’s coverage of 
beneficence, justice, and respect for persons, and of the Seven Critical Issues that 
must be scrutinized when evaluating any case [8]: social and scientific value, 
accepted scientific principles, fair subject selection, informed consent, minimizing 
risks and maximizing benefits, independent review, and respect for subjects. The 
Introduction was prepared by an accomplished expert in radio and web-based 
entertainment industries, after rounds of feedback from a panel of DoD personnel. 

Lessons. This module has four lessons that teach the user about the Seven Critical 
Issues identified by Emmanuel et al. (2000) and how to apply them to particular cases 
that involve ethical abuses. This is a form of case-based reasoning [1], [15]. The first 
lesson presented the user with descriptions of the Seven Critical Issues, a summary of 
the Tuskegee Syphilis Study, and an explanation of how each of the Seven Critical 
Issues was violated in the Tuskegee study. The second lesson presented the user with 
a description of a study on post-traumatic stress disorder. The user was then 
presented with the Seven Critical Issues and had to decide, on a six-point scale, the 
extent to which each issue was potentially a problem in that case. The six-point 
scale was: 1 = definitely not a problem, 2 = most likely not a problem, 3 = 
undecided, guess it’s not a problem, 4 = undecided, guess it’s a problem, 5 = most 
likely a problem, and 6 = definitely a problem. The user then received feedback 
comparing his/her responses with those of a panel of experts from the DoD, along 
with a brief explanation. Discrepancies between the learner’s decisions and the 
judgments of the experts were highlighted. Lesson 3 followed the same procedure 
as Lesson 2, except there was another case on a routine flight test with an 
experimental helmet. Lesson 4 presented two additional cases, following the same 
procedure. One was on helmet-mounted devices and the other on chemotherapy. 

Signal detection analyses were performed on the learner’s decisions as a measure 
of performance. There are four categories of decisions when signal detection analyses 
are applied. 

Hit (H). Both the learner and expert agree that an issue is potentially problematic 
for the particular case. 

Correct rejection (CR). Both the learner and expert are in agreement that an 
issue is not potentially problematic for a case. 

Miss (M). The expert believes there is a potential problem, but the learner does 
not. 

False alarm (FA). The learner believes there is a problem, but the expert 
believes there is no problem. 

The expert panel consisted of 7 specialists in research ethics in the military. A d’ 
score was also computed to assess how well the learner can discriminate whether a 
case does or does not have a problem with respect to an issue. A d’ score of 0 means 
the learner is not discriminating at all, and the score increases as the learner 
becomes progressively more discriminating (with an upper bound of about 4.0). 
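
As an illustration of the scoring, the sketch below classifies learner and expert decisions into the four categories and computes d’ with the standard formula d’ = z(hit rate) − z(false alarm rate). Several details are assumptions rather than facts from the study: ratings of 4 to 6 are treated as "problem" decisions, extreme rates are nudged by half a trial so the inverse normal stays finite, and the ratings themselves are invented.

```python
from statistics import NormalDist


def classify(learner_says_problem, expert_says_problem):
    """Map one learner/expert decision pair to a signal detection category."""
    if expert_says_problem:
        return "H" if learner_says_problem else "M"
    return "FA" if learner_says_problem else "CR"


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Compute d' = z(hit rate) - z(false alarm rate).

    Requires at least one signal trial and one noise trial. Rates of exactly
    0 or 1 are clamped by half a trial (an assumed, commonly used correction).
    """
    z = NormalDist().inv_cdf
    n_signal = hits + misses
    n_noise = false_alarms + correct_rejections
    hit_rate = min(max(hits / n_signal, 0.5 / n_signal), 1 - 0.5 / n_signal)
    fa_rate = min(max(false_alarms / n_noise, 0.5 / n_noise), 1 - 0.5 / n_noise)
    return z(hit_rate) - z(fa_rate)


# Hypothetical ratings on the six-point scale for seven issues in one case.
learner_ratings = [5, 2, 6, 3, 4, 1, 5]
expert_problem = [True, False, True, True, False, False, True]
THRESHOLD = 4  # assumed cut point: ratings of 4-6 count as "problem"

counts = {"H": 0, "M": 0, "FA": 0, "CR": 0}
for rating, expert in zip(learner_ratings, expert_problem):
    counts[classify(rating >= THRESHOLD, expert)] += 1

score = d_prime(counts["H"], counts["M"], counts["FA"], counts["CR"])
```

A learner whose hit rate equals the false alarm rate receives a d’ of 0, which matches the interpretation given above.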

Query Documents. This module allows the user to ask a natural language question 
(or description) and then generates an answer by retrieving the best-matching excerpts 
from various documents in the HURAA web site. For each document that the user 
selects, the highest matching paragraph from the document space is selected by the 
computational linguistics software and is displayed in a window. Beneath this 
window, the headings for the next four results appear. If the top choice is not the one 
that the user needs, s/he can click on the headings to read those excerpts. The search 
engine that was available to identify the optimal matches was latent semantic analysis 
[16], [13]. 

In the search task, the participants were instructed to search the document space 
in order to find answers to 4 test questions. The participants recorded the answers in 
a test booklet. If the answer to a question was lengthy, they were instructed to write 
down the fetched document and section number where the answer was found. 
Performance was measured by retrieval time and the likelihood of retrieving the 
correct paragraph out of the large document space. If the natural language query 
facilities are useful, then there should be facilitation in the speed and likelihood of 
accessing the correct documents. 
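
The retrieval behavior described for Query Documents (one top excerpt plus the headings of the next four matches) can be sketched as a ranking over paragraphs. The code below is not latent semantic analysis: it substitutes plain bag-of-words cosine similarity as a stand-in, whereas LSA would additionally project the term vectors into a reduced space obtained by singular value decomposition. The headings, paragraph texts, and function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(u, v):
    """Cosine similarity between two term-count vectors stored as Counters."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def rank_paragraphs(query, paragraphs, top_k=5):
    """Return up to top_k (heading, score) pairs, best match first."""
    q = Counter(tokenize(query))
    scored = [(heading, cosine(q, Counter(tokenize(body)))) for heading, body in paragraphs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical document space, invented for illustration.
paragraphs = {
    "Informed consent": "Subjects must give informed consent before research begins.",
    "Independent review": "An independent review board must approve the research protocol.",
    "Fair selection": "Selection of subjects must be fair and scientifically justified.",
}

ranking = rank_paragraphs("Who must approve a research protocol?", paragraphs)
best_heading, best_score = ranking[0]                    # excerpt shown in the window
other_headings = [heading for heading, _ in ranking[1:]]  # headings listed beneath it
```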

4.2 Tests in the Test Phase 

The test consisted of three parts: (1) Memory (a test on the important ideas from the 
Introduction and Lessons), (2) Issue comprehension (a test on the participant’s 
ability to identify potentially problematic issues in cases), and (3) Perception ratings 
(ratings on how the participants viewed the learning experiences). 

Memory for Important Ideas. This phase tests memory for the central, core ideas 
from the Introduction and Lesson material. These core concepts are those that all 
users should take away from the learning experience. Memory was assessed in three 
subtests: free recall, cued recall, and the cloze task. The free recall test presented a 
series of concepts that the participants were asked to define or describe off the top 
of their heads. After finishing the free recall task, the cued recall test was administered 
on the next page. The cued recall test had more retrieval cues than the free recall test. 

The cloze procedure had the most retrieval cues. It took verbatim segments of the 
introductory text and left out key words, which the participant filled in. There were 
progressively more retrieval cues for content to be retrieved as one went from free 
recall to the cloze task. 

Issue Comprehension. This test assessed how discriminating the participants were 
in identifying potentially problematic issues on two cases. The cases were selected 
systematically so that 6 of the issues were problematic in one and only one of the two 
cases; one of the issues was problematic in both cases so it was not scored. This test 
is functionally a transfer test from the case-based, explanation-based reasoning task in 
the HURAA acquisition phase. The participants simply read each case and rated the 
seven issues on the 6-point scale (as to whether issue I was problematic for case C). 

Perception Ratings. The participants gave ratings on their perceptions of the 
learning environments. The four rating scales that were included in all three 
experiments are presented below. The values on each rating scale were: 1 = disagree, 
2 = somewhat disagree, 3 = slightly disagree, 4 = slightly agree, 5 = somewhat agree, 
and 6 = agree. Examples are as follows: “You learned a lot about human subjects 
protections.” and “It was easy to use and learn from these instructional materials.” 



5 Results and Discussion 

Table 1, on the last page, presents means and standard deviations of the dependent 
measures in the four experimental conditions. The most striking finding from the 
experiment is the lack of significant differences among conditions. In fact, there were 
no significant differences among the conditions for any of the 13 dependent measures 
in Table 1. This null result is incompatible with all of the above predictions. It 
should be emphasized that the sample size was quite large so the likelihood of a type 
II error was not high. 
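
The paper does not state which statistical test was used. As a rough check that a reader could run on the summary statistics in Table 1, the sketch below computes a one-way ANOVA F ratio from group sizes, means, and standard deviations. It assumes a one-way between-subjects design and that the reported standard deviations are sample standard deviations; because the table values are rounded, the result is only approximate. The function name is hypothetical.

```python
def one_way_anova_from_summary(ns, means, sds):
    """Return (F, df_between, df_within) from per-group n, mean, and SD."""
    total_n = sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / total_n
    # Between-groups sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    # Within-groups sum of squares, recovered from the sample SDs.
    ss_within = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds))
    df_between = len(ns) - 1
    df_within = total_n - len(ns)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Enjoyment ratings from Table 1 (Full, Voice, Print, No Guide).
ns = [40, 39, 38, 38]
means = [3.50, 3.46, 3.58, 2.89]
sds = [1.47, 1.48, 1.41, 1.47]

f_stat, df_between, df_within = one_way_anova_from_summary(ns, means, sds)
```

The resulting F ratio can then be compared against the critical value for the returned degrees of freedom.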

The practical implication of the result is that the animated conversational agent did 
not facilitate learning, usage, and perceptions of the interface. In essence, the agent 
and the conversational guidance had no bang for the buck. Perhaps the web facility 
was designed extremely well, so well that a navigational guide was superfluous. The 
navigational agent might prove to be more effective when the information on the screen 
is more complex, congested, and potentially confusing. Knowledge management 
systems often have complex information displays so the value of these agents may 
increase as a function of the complexity, ambiguity, and perplexity of the system. 
Perhaps there are special conditions when a navigational guide of some form will be 
helpful, whether it be print, voice, or a talking head. However, these precise 
conditions have yet to be discovered and precisely specified in the literature. 

It is appropriate to acknowledge that the results of the present study on agents as 
navigational guides do not generalize to other learning environments. Animated 
conversational agents have proven to be effective when they deliver information and 
learning material in monologues and tutorial dialogues [2], [19], [21], particularly 
when the test taps deep levels of comprehension. However, only a handful of 
empirical studies have systematically investigated the impact of these conversational 
agents on learning, so more research is definitely needed. One intriguing finding is 
that the amount of information that a person learns and remembers from a learning 
system is not significantly correlated with how much the learner likes the system [18]. 
Simply put, learning is unrelated to liking. If this result is accurate, then it is not 
sufficient to simply ask users and individuals in focus groups what they like or do not 
like about agents and navigational guides. There also needs to be a serious, deep, 
research arm that goes beyond intuitions of users, designers, and managers. 

Table 1. Means (and Standard Deviations) for Dependent Measures. 

DEPENDENT MEASURES                          Full Guide    Voice Guide   Print Guide   No Guide
Number of participants                      40            39            38            38

Memory for Core Concepts
  Free recall proportion                    .45 (.21)     .43 (.20)     .42 (.21)     .44 (.20)
  Cued recall proportion                    .51 (.24)     .50 (.23)     .45 (.26)     .53 (.23)
  Cloze recall proportion                   .44 (.15)     .42 (.18)     .39 (.17)     .47 (.19)
  Introduction study time (minutes)         6.6 (3.3)     9.3 (19.0)    7.4 (4.7)     10.7 (19.8)

Problematic Issue Identification
  Hit proportion                            .58 (.10)     .58 (.10)     .56 (.12)     .60 (.13)
  False alarm proportion                    .40 (.32)     .39 (.30)     .38 (.34)     .38 (.31)
  d’ score (discrimination)                 .30 (.46)     .31 (.36)     .14 (.75)     .27 (.74)
  Task completion time (minutes)            23.7 (6.2)    23.7 (8.1)    20.8 (5.2)    22.9 (4.1)

Search for Information
  Correct document retrieval (proportion)   .55 (.25)     .50 (.49)     .47 (.23)     .52 (.25)
  Search time (minutes)                     27.1 (11.4)   24.4 (9.6)    23.5 (9.2)    24.8 (7.9)

Perception ratings
  Amount learned                            4.75 (1.06)   4.62 (1.16)   4.61 (1.08)   4.53 (1.48)
  Interest                                  3.85 (1.63)   4.08 (1.46)   4.11 (1.41)   3.79 (1.82)
  Enjoyment                                 3.50 (1.47)   3.46 (1.48)   3.58 (1.41)   2.89 (1.47)
  Ease of learning                          4.13 (1.40)   3.95 (1.45)   3.71 (1.71)   3.61 (1.57)

Abstract. In this paper we present an analysis and modeling case study for 
agent-mediated knowledge management in educational environments: 
Help&Learn, an agent-based peer-to-peer helpdesk system to support extra- 
class interactions among students and teachers. Help&Learn expands the 
student’s possibility of solving problems, getting involved in a cooperative 
learning experience that transcends the limits of classrooms. To model 
Help&Learn, we have used the Agent-Object-Relationship Modeling Language 
(AORML), a UML extension for agent-oriented modeling. The aim of this 
research is two-fold. On the one hand, we aim at exploring Help&Learn’s 
potential to support collaborative learning, discussing its knowledge 
management strategy. On the other hand, we aim at showing the expressive 
power and the modeling strengths of AORML. 



1 Introduction 

As we enter the new millennium, we are witnessing a shift in the business model from the old 
static model, based on hierarchic organizations, towards a dynamic model, built on 
top of continuously changing and knowledge-based organizations. This new business 
model requires that the twenty-first century professionals have a set of characteristics, 
such as creativity, flexibility and ability to cooperate and work in teams. The 
hierarchical educational model is not appropriate to educate these professionals [8]. 
Methods based on collaboration, viewing students as consumers but also as providers 
of knowledge [15] can lead to better results because they aim at motivating active 
participation of the individual in the learning process, which often results in the 
development of creativity and critical thinking [8]. 

Knowledge Management (KM) deals with the creation, integration and use of 
knowledge [6]. These processes are directly related and can be very beneficial to 
collaborative learning. In fact, we can say KM systems support some kind of 
unintentional learning, since users can learn, while sharing knowledge. Although the 
benefits of KM for education have been acknowledged many times [13], its full 
exploration in educational settings is still to be seen. In this context, it is important to 
deliver knowledge in a personalized way, with respect to user’s preferences regarding 
content and presentation. Another issue here is dealing with logically and physically 
dispersed actors and knowledge sources. The system must provide flexible ways to 
access these sources, helping the user to find the required knowledge at the right time 
[6]. Agent-mediated KM comes as a solution in such dynamic environments [5]. 
Agents can exhibit this type of flexible behavior, providing knowledge both 
“reactively”, on user request, and “pro-actively”, anticipating the user’s knowledge 
needs. They can also serve as personal assistants, maintaining the user’s profile and 
preferences. 

This paper presents Help&Learn (H&L), a Web-based peer-to-peer helpdesk 
system to support extra-class discussions among students and teachers. In this 
environment, knowledge emerges as a result of the ongoing collaboration process. As 
the peer-to-peer architecture suggests, the relationship between teachers and students 
is non-hierarchical. Instead, all users are peers who collaborate in knowledge creation, 
integration and dissemination. In H&L, personal assistants are used to maintain a user 
profile and knowledge base, and other software agents are responsible for finding the 
best peer to answer a question, and for managing knowledge resources. 

From a software engineering perspective, the analysis and design of the distributed 
processes involved in knowledge management become increasingly sophisticated and 
require an agent-oriented approach, such as the Agent-Object-Relationship Modeling 
Language (AORML) [16], an extension of UML to model agent-oriented information 
systems. The strengths of AORML with respect to KM systems are: 1) it treats the 
organizations and actors of a domain as agents in the modeling process. In this way, 
business processes can be modeled on the basis of the interactions between (human 
and artificial) agents working on behalf of their organizations. Related work is 
mentioned in [5]. Although norms and contracts are not directly supported by 
AORML, it provides deontic modeling constructs such as commitments and claims 
with respect to external agents, and obligations and rights with respect to internal 
agents; 2) ‘mentalistic’ concepts of agents, such as beliefs and commitments, are 
explicitly represented in the system model, which helps the software engineer to 
reason about and to model the behavior of agents, both internally and in interaction 
with other agents of the system; 3) it captures the behavior of agents with 
the help of rules. Besides these strengths, since AORML is an extension of UML, 
preserving its principles and concepts, it is an accessible language, and it is likely to 
face less resistance for industrial acceptance and use. 

The aim of this research is two-fold. On the one hand, we aim at exploring H&L’s 
potential to support collaborative learning, discussing its KM strategy. On the other 
hand, we show that AORML is an appropriate language to model such a system, as 
well as other KM environments. In section 2, KM is described in connection with the 
educational context. Section 3 presents a description of H&L, introducing the main 
problems and activities in focus. Section 4 introduces AORML and its modeling 
constructs, which are then applied in section 5, presenting part of the Help&Learn 
system’s modeling. Section 6 acknowledges some work related to Help&Learn and to 
AORML. Finally, some directions for future work and conclusions are presented in 
sections 7 and 8, respectively. 

2 Knowledge Management in Education 

Collaborative learning mediated by network-based environments has been the focus 
of many recent research initiatives and experiments [8,12], especially within the 
CSCL and the E-learning communities. The need for a different kind of learning 
approach has also been noted within the KM literature, as in [6]: 

“The traditional education paradigm is inappropriate for studying the types of 
open-ended and multidisciplinary problems that are most pressing to our society. 
These problems, which typically involve a combination of social and technological 
issues, require a new paradigm of education and learning skills, including self- 
directed learning, active collaboration, and consideration of multiple perspectives.” 

Maçada and Tijiboy (1998) [8] consider three essential elements for collaborative 
learning to succeed in network-based environments: a) cooperative posture, which 
involves a non-hierarchical relationship between the participants, collaboration, 
constant negotiation, open-mindedness, etc.; b) collaborative technological 
infrastructure; and c) a non-hierarchical method, i.e. it is very important that all the 
participants get involved in the constant organization and re-organization of the 
environment dynamics (meaning the establishment of goals, norms, roles, priorities of 
tasks, etc.). 

Focusing especially on b), this work is based on the assumption that KM can be 
generally beneficial for learning [13]. KM can, for instance, motivate learners to be 
more active and to collaborate. While feeding a KM system, the users need to create 
artifacts, externalizing their knowledge, in order to make it available for other users 
(user-based approach for knowledge creation, similar to the one adopted in [6]). This 
process of externalization is an important step for learning. Supporting this idea, 
Constructionist learning theories emphasize the importance for the learner to produce 
something concrete, which he can share with his peers [3]. In other words, 
externalizing knowledge by means of a sharable artifact will help the learner to 
perform synthesis and learn, and at the same time it may motivate him for peer 
collaboration. 

The knowledge resources exchanged in a learning environment cannot be much 
differentiated from those exchanged for other purposes. In this context: i) there is a 
share of physical resources, such as: books, articles, and other educational artifacts; ii) 
with the growing use of information technology and the Internet in these settings, 
there are plenty of electronic documents, references, and web links; and, finally, iii) 
there is also tacit knowledge [6], i.e. knowledge that is contained in people’s minds 
and that is usually informally exchanged among them by different means, for 
instance, in person, through messages, or via Internet communication tools integrated 
in virtual learning environments [12]. All these forms of knowledge need to be 
properly integrated and managed in order to bring about positive changes in the 
teaching/learning process. 

Exemplifying the common difficulties of this context, we mention the fact that all 
these resources are distributed among people and that it is not easy to find out who 
has the right piece of information, knowledge or advice. The nature of these problems 
suggests that KM systems (KMSs) are well suited to learning settings. 
In addition, the specific characteristics of software agents make them promising 
candidates for providing a KMS solution [5]. These agents can be used both as a 
metaphor to model the domain in which the system will be deployed, and as software 
components to develop the actual KMS. 

Targeting the above highlighted problems, we propose H&L to support the 
organization and sharing of distributed knowledge. The specifics about the KM 
strategies applied in H&L are presented in the next section. 

3 Help&Learn: A Peer-to-Peer Architecture for Knowledge 
Management in Learning Settings 

Peer-to-peer introduces a set of concepts that takes a human centered view of 
knowledge as residing not just in people’s minds but also in the interaction between 
people and between people and documents [15]. 

H&L expands students’ opportunities to resolve their doubts by involving them in a 
cooperative learning experience that transcends the limits of classrooms. By 
collaborating with other peers, the students learn from the doubts of others and 
develop cognitive abilities, such as stating their doubts and thoughts clearly, 
interpreting questions, mediating discussions, and solving problems. In this open 
context, other interested parties may join the learning community, such as business 
employees and online organizations. They bring different perspectives to the 
discussions, making the cooperation richer. Figure 1 shows the peer-to-peer 
architecture of the proposed scenario. 



Fig. 1. A teacher, a student, and employees of a company, interacting to ask and answer 
questions in the proposed peer-to-peer environment 

We use the metaphor of a helpdesk, where somebody asks for help (the helpee) and 
somebody provides the needed help (the helper). Each peer in the network is seen as a 
source of knowledge. The agents of the system are responsible for managing the 
exchanges between these sources. This includes: a) handling a peer request for help 
and delivering help in a personalized way; b) finding the best peer to answer a help 
request; and c) searching through previously asked questions/answers [12]. 

In H&L, knowledge is created and integrated at use time, with the users 
participating in these processes, and not at design time with the help of a knowledge 
engineer. This model has many advantages, such as preventing knowledge artifacts 
from becoming obsolete because they depend on knowledge engineering, and motivating 
the users of the system to engage in collaboration and learning, while creating and 
sharing the artifact [6]. The knowledge in H&L is exchanged by the system peers in 
the form of HelpItems. These HelpItems can be what-is or how-to-do explanations, 
bibliographic or Web references, electronic documents, or even hardcopies, 
depending on the peers’ setting (e.g. inside a school or a company, hardcopies can be 
exchanged in addition to electronic copies). 

The users are not required to perform knowledge formalization. The exchanged 
questions and answers are expressed and stored in natural language. Besides 
mediating this exchange of help, the system agents are responsible for searching 
through previously asked questions and answers to provide the users with suitable 
help. 

The quality of knowledge artifacts is an important issue in KMSs [6]. In H&L, this 
is measured by the peers themselves. The help provided is annotated by the helpee, 
and this information is shared among the agents of the system, to be considered in 
future helper indication. 

As in a typical peer-to-peer application [15], a key issue here is finding the best 
peer to satisfy a certain help request. A helper is selected if she can fulfill a help 
request, by providing the helpee with appropriate HelpItems. Besides expertise, the 
time and availability of the peer are also considered for the best helper indication. As 
an example, a teacher may know the answer to a student’s question but she may have 
less time than an advanced student to spend on it. 
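
The selection criteria named above (expertise, time, availability, and the helpees’ annotations of past help) can be combined in many ways. The following is a minimal sketch under stated assumptions: every attribute is normalized to the range 0 to 1, availability acts as a hard filter, and the remaining criteria are combined by a weighted sum. The class, the function names, and the weights are hypothetical and are not part of H&L as described.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    expertise: float    # 0..1, how well the peer knows the topic
    free_time: float    # 0..1, share of time the peer can spend helping
    available: bool     # whether the peer can currently be reached
    avg_rating: float   # 0..1, mean annotation given by past helpees


def helper_score(peer, weights=(0.4, 0.3, 0.3)):
    """Weighted sum of expertise, free time, and past ratings (assumed weights)."""
    w_exp, w_time, w_rating = weights
    return w_exp * peer.expertise + w_time * peer.free_time + w_rating * peer.avg_rating


def best_helper(peers):
    """Return the highest-scoring available peer, or None if nobody is available."""
    candidates = [p for p in peers if p.available]
    return max(candidates, key=helper_score, default=None)


# Hypothetical peers mirroring the teacher versus advanced student example.
peers = [
    Peer("teacher", expertise=0.9, free_time=0.1, available=True, avg_rating=0.8),
    Peer("advanced_student", expertise=0.7, free_time=0.8, available=True, avg_rating=0.7),
    Peer("classmate", expertise=0.9, free_time=0.9, available=False, avg_rating=0.9),
]
chosen = best_helper(peers)
```

With these invented numbers the advanced student is preferred over the teacher, because the teacher’s higher expertise is outweighed by her lack of time.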

A common problem in KM settings is motivating the users of the system to use it 
to its full potential [6,7]. The peer motivation to participate in discussions and answer 
helpees’ questions can be given by a sense of belonging to a learning community, 
or by the desire of having a good social status [7]. However, this motivation can also 
be caused by external factors, like teacher’s reinforcements or an external grading 
system. 



4 Agent-Object-Relationship Modeling 

The Agent-Object-Relationship (AOR) modeling approach [16] is based on an 
ontological distinction between active and passive entities, that is, between agents and 
objects. This helps to capture the semantics of complex processes, such as the one that 
involves teachers and students, owners and employees of a company, and other actors 
involved in a KM environment. The agent metaphor subsumes both artificial and 
natural agents. This way, the users of the information system are included and also 
considered as agents in AOR modeling. 

Intuitively, some connections can already be identified between the knowledge 
artifacts in a KMS and objects, and between the KMS users and human agents. The 
KMS itself can also be composed of multiple software agents, which perform 
different tasks, accomplishing various goals, in order to mediate the processes of 
knowledge creation, integration and sharing. These agents can be identified and 
modeled with the aid of AORML. 

AOR distinguishes between agents and objects according to these two main points: 
1) while the state of an object in OO programming has no generic structure, the state 
of an agent has a ‘mentalistic’ structure: it consists of mental components such as 
beliefs and commitments. 2) while messages in object-oriented programming are 
coded in an application-specific ad-hoc manner, a message in Agent-Oriented 
Programming is coded as a ‘speech act’ according to a standard agent communication 
language that is application-independent. 
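
The two distinguishing points above can be illustrated with a minimal sketch. It is not part of AORML or of H&L; the class names, the performative values ("inform", "request") and the simplification that every request creates a commitment are assumptions made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PlainObject:
    """Passive entity: its state is an arbitrary attribute set with no generic structure."""
    attributes: dict = field(default_factory=dict)


@dataclass
class SpeechAct:
    """Message coded as a speech act: a performative plus application content."""
    performative: str  # illustrative values: "inform", "request"
    sender: str
    receiver: str
    content: str


@dataclass
class Agent:
    """Active entity whose state has a 'mentalistic' structure."""
    name: str
    beliefs: dict = field(default_factory=dict)
    commitments: list = field(default_factory=list)

    def receive(self, msg: SpeechAct) -> None:
        # Dispatch on the performative, not on an application-specific method name.
        if msg.performative == "inform":
            self.beliefs[msg.content] = True
        elif msg.performative == "request":
            # Simplifying assumption: every request is accepted as a commitment.
            self.commitments.append((msg.sender, msg.content))


pa = Agent("PA")
pa.receive(SpeechAct("inform", "DS", "PA", "question already answered"))
pa.receive(SpeechAct("request", "Anna", "PA", "explain p2p"))
```

After the two messages, the agent's state consists of one belief and one commitment, whereas a `PlainObject` would only hold whatever attributes the application chose to store.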

In AORML, an entity is either an agent, an event, an action, a claim, a 
commitment, or an ordinary object. Agents and objects form, respectively, the active 
and passive entities, while actions and events are the dynamic entities of the system 
model. Commitments and claims establish a special type of relationship between 
agents. These concepts are fundamental components of social interaction processes 
and can explicitly help to achieve coherent behavior when these processes are semi- or 
fully automated. 

Only agents can communicate, perceive, act, make commitments and satisfy 
claims. Ordinary objects are passive entities with no such capabilities. Besides human 
and artificial agents, AOR also models institutional agents. Institutional agents are 
usually composed of a number of human, artificial, or other institutional agents that 
act on their behalf. Organizations, such as companies, government institutions and 
universities, are modeled as institutional agents, which allows the rights and 
duties of their internal agents to be modeled. 

There are two basic types of AOR models: external and internal models. An 
external AOR model adopts the perspective of an external observer who is looking at 
the (prototypical) agents and their interactions in the problem domain under 
consideration. In an internal AOR model, we adopt the internal (first-person) view of 
a particular agent to be modeled. 

This paper is focused on the exemplification of external AOR models, which 
provide the means for an analysis of the application domain. Typically, these models 
have a focus, that is an agent, or a group of agents, for which we would like to 
develop a state and behavior model. Figure 2 shows the elements of an external AOR 
model, in which the language notation can be seen. 

Object types belong to one or several agents (or agent types). They define 
containers for beliefs. If an object type belongs exclusively to one agent or agent type, 
the corresponding rectangle is drawn inside this agent (type) rectangle. If an object 
type represents beliefs that are shared among two or more agents (or agent types), the 
object type rectangle is connected with the respective agent (type) rectangles by 
means of a UML aggregation connector. 

As can be seen in Figure 2, there is a distinction between a communicative action 
event (or a message) and a non-communicative action event. Also, AOR 
distinguishes between action events and non-action events. The figure also shows that 
a commitment/claim is usually followed by the action event that fulfills that 
commitment (or satisfies that claim). 

An external model may comprise one or more of the following diagrams: 

• Agent Diagrams (ADs), depicting the agent types of the domain, certain 
relevant object types, and the relationship among them. An AD is similar to a 
UML class diagram, but it also contains the domain’s artificial, human and 
institutional agents. 

• Interaction Frame Diagrams (IFDs), depicting the action event types and 
commitment/claim types that determine the possible interactions between two 
agent types (or instances). 
Fig. 2. The core elements of AOR external models 



• Interaction Sequence Diagrams (ISDs), depicting prototypical instances of 
interaction processes. 

• Interaction Pattern Diagrams (IPDs), focusing on general interaction patterns 
expressed by means of a set of reaction rules defining an interaction process 
type. Reaction rules are the component chosen by AOR to show the agent’s 
reactive behavior, and they can be represented both graphically and textually. 

These diagrams will be exemplified in the following section. For further reference, 
we refer to [16] and to the AOR website: http://aor.rezearch.info/. 



5 Help&Learn Modeling 

AORML can be used throughout the whole development cycle of a system. In this 
paper, we will focus on the analysis phase, in which we applied AOR external 
models. Figure 3 depicts the agent diagram, which includes all human, artificial and 
institutional agents (distinguished by UML stereotypes) involved in the helpdesk, and 
their relationships. Note that this diagram is very similar to a UML class diagram, 
which shows the system’s classes and the relationships between them. For clarity, 
the attributes of agents and objects are omitted in this diagram. However, they can be 
expressed following the traditional UML syntax. 

As the above diagram shows, H&L brings together students, teachers and general 
business professionals as peers of a learning community. Below, we give a brief 
description of each artificial agent of the system. 

Help&Learn Infrastructure Server (IS). This agent addresses the management of 
the H&L system itself. It provides the other artificial agents of the system, as well as 
periodic updates. 

Peer Assistant (PA). In order to start participating in discussions in the system, a 
Person downloads the Peer Assistant (PA) from the H&L IS. This way, this Person 
becomes one of the system Peers, being able to act both as a helpee and as a helper for 
other Peers. 
Fig. 3. Helpdesk System Agent Diagram 



Directory Server (DS) and Broker (B). Every time the PA goes online, it registers 
with the Directory Server (DS), becoming available to answer help requests. When 
doing this, the PA provides the DS with a minimal Peer profile, indicating which 
topics its Peer can answer. The Broker, on the other hand, creates its own Peer 
profile by contacting the PAs and also by applying data mining techniques to the DS 
profiles, in order to make rankings and classifications. The Broker ranks the Peers 
based on expertise, availability and reliability, and it classifies them based on interests. 
This way, when queried by the PAs, it can provide information on the most 
appropriate Peers to answer a certain help request. The DS also maintains a repository 
of previously provided explanations, along with their respective requests (typically, 
questions). The PA therefore consults the DS every time a question is forwarded to 
it by a helpee, to check whether or not this question has already been answered. If so, 
the answer is immediately returned to the helpee; otherwise, the PA consults the 
Broker for a best helper indication. In this repository, Information Retrieval 
techniques are used in order to group similar questions and aid the retrieval of the 
relevant ones, as well as to support the creation of an automatic FAQ, according to the 
proposals of a previous work [12]. 
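
The lookup-then-rank flow described above can be sketched as follows. This is only an illustration under stated assumptions: the paper does not specify how the Broker combines expertise, availability and reliability, so the weighted sum, the weights, the 0..1 scales and the names `PeerProfile`, `rank_peers` and `handle_question` are hypothetical. An exact dictionary match stands in for the Information Retrieval similarity used by the DS.

```python
from dataclasses import dataclass


@dataclass
class PeerProfile:
    name: str
    expertise: float      # hypothetical 0..1 scale
    availability: float   # hypothetical 0..1 scale
    reliability: float    # hypothetical 0..1 scale


def rank_peers(profiles, weights=(0.5, 0.3, 0.2)):
    """Broker side: order peers by a weighted score (the weights are assumed)."""
    w_exp, w_avail, w_rel = weights

    def score(p):
        return w_exp * p.expertise + w_avail * p.availability + w_rel * p.reliability

    return sorted(profiles, key=score, reverse=True)


def handle_question(question, explanation_case, profiles):
    """PA side: consult the DS repository first, then fall back to the Broker."""
    answer = explanation_case.get(question)  # exact match stands in for IR similarity
    if answer is not None:
        return ("answer", answer)
    return ("helpers", [p.name for p in rank_peers(profiles)])


profiles = [
    PeerProfile("teacher", expertise=0.9, availability=0.2, reliability=0.9),
    PeerProfile("advanced student", expertise=0.7, availability=0.9, reliability=0.8),
]
first_time = handle_question("what is p2p?", {}, profiles)
repeated = handle_question("what is p2p?", {"what is p2p?": "p2p is ..."}, profiles)
```

With these illustrative numbers the advanced student is ranked ahead of the teacher, because higher availability outweighs slightly lower expertise; this mirrors the teacher example given earlier in the paper.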

SIG Assistant. Special Interest Groups (SIGs) are also allowed to participate in the 
system (this is indicated by the inclusion of the institutional agent SIG in the agent 
diagram of Fig. 3). These SIGs usually pre-exist the system, but can also be created 
by suggestion of the Broker. It is not necessary that all the members of a SIG be 
Peers; one member is enough (note this, again, in the agent diagram, which 
generalizes a SIG Member as a Person, instead of relating it to the Peer class). The 
Broker has a representation of the SIGs and can also suggest that a PA contact one of 
the SIG Assistants in order to ask the SIG for help. The SIG Assistant broadcasts the 
message to all members of the SIG. Then, the answers are sent back to the PA. 
Today, there are many SIGs advertised on the Web, specialized in several different 
areas. By introducing them to the helpdesk system, we hope to broaden their 
interaction scope, while giving other Peers the opportunity to 
have their help requests answered by an expert on the topic. 

Resource Manager (RM). The Resource Manager brings to the system existing 
knowledge bases, which can be databases, document repositories etc. This way, 
HelpItems that are not owned by any of the system Peers can also be considered and 
consulted by the PAs. These knowledge bases can be consulted through keyword or 
query search. A System Peer does not directly contact an RM. Instead, this is done 
through the PA. In the case of a query search, the Peer uses an interface, based on 
established query languages such as SQL, XML-Query, or RDF-Query, which will 
then be translated by the contacted RM to the query language of the specific 
knowledge base. These agents are typically downloaded by the owners of existing 
knowledge bases, who will create the translation specification. 



5.1 Interactions in Help&Learn 

The next step after defining the agents in the system is to model their interactions 
using Interaction Sequence Diagrams (ISDs) for concrete examples. In H&L, a Peer 
can request an explanation or a document (reference, electronic copy or hardcopy). 
For lack of space, only the first is exemplified in this paper. 

A prototypical interaction sequence triggered when a Peer issues a request for an 
explanation is shown in Figures 4 and 5. Such a sequence should generally be 
kept in a single ISD, integrating the whole process. This is especially useful 
for automatic code generation. Here, we chose to divide the sequence into two phases in 
order to facilitate our exploration of the modeling language specifics and to make 
the interaction sequence easier to understand. 

Figure 4 shows the Peer request and the best helper indication by the Broker. Here, 
Anna, a system Peer, issues a request for help to her PA, asking “what is p2p?”. The 
PA attempts first to find out if this question has already been asked, by querying the 
DS maintaining the Explanation Case (see Fig. 3). Since this question is asked for the 
first time, the PA cannot provide a direct answer and asks the Broker to find the best 
helper to answer this question. The Broker returns a ranked list of possible Peers for 
the PA to select. In our example, this list contains only one indication: Mark. 

Having the Broker’s indication, the PA will then try to get the HelpItem that 
fulfills its user’s request. This is depicted in Fig. 5. It starts with Anna’s PA 
contacting Mark’s PA with the request for help. Mark’s PA replies with an 
acknowledge message, confirming it received the request. At this moment, a 
commitment is established from Mark’s PA towards Anna’s PA, stating that the former 
will try to get help (from its peers) to answer the latter’s request. 
Fig. 4. Interaction Sequence Diagram, showing a help request being issued by Anna, a H&L 
Fig. 5. Interaction Sequence Diagram, showing how a PA deals with the request for an 
explanation, on behalf of its user 



This commitment is also represented in the ISD of Fig. 5. It is created by the 
acknowledge message (dashed arrow along with a “C”, for “Create”) and it has two 
arguments, a provideHelp and a noHelpAvailable message. These two messages 
compose an Or-Split (diamond containing an “x”), which represents the possible 
outcomes if the commitment is fulfilled. If any other possibility occurred, it would 
mean the commitment had been broken. Proceeding in the ISD, we will see this is not 
the case in this example. Mark’s PA forwards the request to Mark, who provides the 
following answer: “p2p is a distributed technology...”. This message is then 
forwarded to Anna’s PA (note the arrow from this message to the commitment, 
indicating its fulfillment). Finally, the help (i.e. the explanation) reaches its destination: 
Anna. 
Fig. 6. Interaction Sequence Diagram, showing the interaction process when a request for a 
keyword search is issued by a peer 



The use of commitments supports situations in which the communication between 
two agents is asynchronous, as in this case. Mark’s PA confirms it is going to provide 
the help. However, Anna’s PA knows this can take some time, depending on Mark’s 
availability and willingness to respond. Commitments are also good constructs for dealing with 
agents’ autonomy. If it were useful, we could represent, for example, a commitment 
between Mark and his PA, establishing that Mark commits to answering the help 
request. At first sight, this does not seem very natural, since Mark is a human and, as 
such, has full autonomy over the system. In other cases, though, such as those dealing with 
life-threatening situations and, of course, with artificial agents, this can be a rather good 
approach. Furthermore, commitments can be used as triggers for exception handling. 
For example, what should Anna’s PA do in case Mark’s PA does not meet its 
commitment? In H&L, this agent tries to find another Peer to answer the help 
request. 
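
The commitment life cycle and the exception handling described above can be sketched as a small state machine. This is a hedged illustration, not H&L's implementation: the `deadline` timeout, the state names and the helper `next_helper` are assumptions, since the paper does not say how a PA detects that a commitment has not been met.

```python
from dataclasses import dataclass

# The two branches of the Or-Split that count as fulfilling the commitment.
FULFILLING = {"provideHelp", "noHelpAvailable"}


@dataclass
class Commitment:
    debtor: str      # e.g. Mark's PA
    creditor: str    # e.g. Anna's PA
    deadline: float  # hypothetical timeout, not part of the paper's model
    state: str = "pending"

    def on_message(self, name: str) -> None:
        """An agreed message fulfills the commitment; any other outcome breaks it."""
        if self.state == "pending":
            self.state = "fulfilled" if name in FULFILLING else "broken"

    def on_tick(self, now: float) -> None:
        """Assumption: silence past the deadline counts as a broken commitment."""
        if self.state == "pending" and now > self.deadline:
            self.state = "broken"


def next_helper(ranked, commitment, current):
    """Exception handling: after a broken commitment, move to the next ranked Peer."""
    if commitment.state != "broken":
        return current
    idx = ranked.index(current)
    return ranked[idx + 1] if idx + 1 < len(ranked) else None


kept = Commitment("Mark's PA", "Anna's PA", deadline=10.0)
kept.on_message("provideHelp")

missed = Commitment("Mark's PA", "Anna's PA", deadline=10.0)
missed.on_tick(11.0)
fallback = next_helper(["Mark", "Joanna"], missed, "Mark")
```

Because the asynchronous reply and the timeout are handled separately, the creditor agent can keep working while it waits and only looks for another Peer once the commitment is actually broken.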

Besides requesting an explanation, a Peer can also ask for documents, providing its 
PA with a list of keywords. Figure 6 depicts the interactions between Help&Learn’s 
agents, when a request of this type is issued. 

In the ISD of Fig. 6, Anna asks her PA for documents about “peer-to-peer”. The 
PA asks the Broker who the best helpers are to answer this request. The Broker 
returns a ranked list of Peers that can answer it. In this case, this list contains two 
Peers: Mark and Joanna. Next, Anna’s PA contacts the PAs of both Peers, forwarding 
the request for keyword search to them. The PAs search through the documents 
owned by their Peers, returning the available documents to Anna’s PA. Finally, 
Anna’s PA forwards the documents to Anna. 
Fig. 7. Interaction Pattern Diagram, showing the PA’s internal behavior when receiving a 
HelpItem on behalf of its user 

Note that each sequence shown in Figs. 4, 5 and 6 depicts just one of many possible 
interactions. The software engineer should make a number of ISDs in order to capture 
various interaction perspectives. The interactions can afterwards be generalized 
in Interaction Frame Diagrams (IFDs), which depict the action event 
types and the commitment/claim types that determine the possible interactions 
between two agent types (or instances) [16]. 

Further, the interactions can be detailed in Interaction Pattern Diagrams (IPDs). 
These diagrams depict general interaction patterns expressed by means of a set of 
reaction rules, defining an interaction process type. Reaction rules are the 
component chosen by AOR to show the agent’s reactive behavior, and they can be 
represented both graphically and textually. Figure 7 depicts an example of this type of 
diagram. 

The IPD of Fig. 7 depicts only the two agents involved in this specific process: the 
PA and the DS. When the DS receives a checkIfExistingExplanation message, it 
immediately reacts, checking if the sent Question can be found in the Explanation 
Case (i.e. if the question is similar enough to one or more previously asked ones, 
according to DS’s internal algorithms). 

In the affirmative case, the DS sends back the respective answer to the PA. 
Otherwise, it simply “says no”. This is modeled with the rule R1, which is textually 
represented (see Table 1). 

After the external model has been completed, the modeling can proceed to the 
design stage, in which, for each type of agent system to be designed, the external 
model is internalized according to the perspective of the respective agent, and 
subsequently further refined. For instance, an action event, if created by the agent to 
be designed, is turned into an action, while it is turned into an event if it is perceived 
by it. Using such an internal perspective and the corresponding indexical terms (such 
as actions and outgoing messages versus events and incoming messages) leads to a 
natural terminology for designing and implementing agents. H&L’s internal models 
will be the subject of future publications. 



Table 1. Textual Representation of the R1 Reaction Rule 

ON    Event      RECEIVE checkIfExistingExplanation(?Question) FROM ?PeerAssistant 
IF    Condition  ExplanationCase(?Question, ?Answer) 
THEN  Action     SEND replyIfExistingExplanation(?Answer) TO ?PeerAssistant 
ELSE  Action     SEND replyIfExistingExplanation(?Answer=“no”) TO ?PeerAssistant 
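
As a rough illustration of how the ON/IF/THEN/ELSE structure of rule R1 could be read operationally, the sketch below encodes it as a plain function. It is not generated from AORML; the tuple encoding of events and messages and the function name `rule_r1` are assumptions, and an exact dictionary lookup stands in for the DS's internal similarity algorithms.

```python
def rule_r1(event, explanation_case):
    """Reaction rule R1 of the DS, written as an event -> message function."""
    name, question, sender = event
    # ON: the rule only triggers on this message type.
    if name != "checkIfExistingExplanation":
        return None
    # IF: is there an ExplanationCase(?Question, ?Answer) entry?
    answer = explanation_case.get(question)
    if answer is None:
        # ELSE: reply with the literal "no".
        answer = "no"
    # THEN / ELSE: send the reply back to the PeerAssistant that asked.
    return ("replyIfExistingExplanation", answer, sender)


case = {"what is p2p?": "p2p is a distributed technology..."}
hit = rule_r1(("checkIfExistingExplanation", "what is p2p?", "Anna's PA"), case)
miss = rule_r1(("checkIfExistingExplanation", "what is AOR?", "Anna's PA"), case)
```

Both branches produce the same message type, so the receiving PA only needs to inspect the answer argument to decide whether to consult the Broker.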



6 Related Work 

Regarding Help&Learn, it is important to mention other initiatives on developing 
peer-to-peer architectures to support knowledge sharing. One of these initiatives is the 
EDUTELLA project [10], which aims at providing a peer-to-peer networking 
infrastructure to support the exchange of educational material. In order to accomplish 
this, peers can make their documents available in the network, specifying metadata 
information as a set of RDF statements. 

Bonifacio et al. [2] have developed KEx, a peer-to-peer system to mediate 
distributed knowledge management. KEx allows each individual or community of 
users to build their own knowledge space within a network of autonomous peers. 
Each peer can make documents locally available, along with their context, i.e. a 
semantic representation of the documents’ content. When searching documents from 
other peers, a set of protocols of meaning negotiation are used to achieve semantic 
coordination between the different representations (contexts) of each peer. Both 
EDUTELLA and KEx are specifically concerned with the exchange of documents and 
do not address peer collaboration through the exchange of messages, which is one of 
the targets of H&L. 

On the other hand, Vassileva [14] proposes a peer-to-peer 
system to support the exchange of messages between students. A student needing help 
can request it through his agent, which finds other students who are currently online 
and have expertise in the area related to the question. As in H&L, there is a 
centralized matchmaker service, which maintains models of the users’ competences 
and matches them to the help requests. This work is particularly concerned with user 
motivation to collaborate. Thus, the system rewards users who contribute to the 
community, by providing them with a better quality of service. 

Concerning agent-oriented modeling, we should mention AUML [11] and 
Message/UML [4] since both propose UML extensions to model agent-based systems. 
AUML has especially extended UML sequence diagrams to model interaction 
protocols involving agent roles. Message/UML proposes five views: Organization, 
Goal/Task, Agent/Role, Interaction and Domain views, each of them modeling a 
specific aspect of the multi-agent system. In comparison with AORML, these two 
approaches do not target domain modeling, being both design-oriented. Besides, both 
of them lack the ‘mentalistic’ concepts (commitments, claims and beliefs) present 
in AORML. 

It is also important to acknowledge the efforts of Molani et al. [9] in the direction 
of providing a system analysis and design methodology specific to the Knowledge 
Management domain. They claim that, in order to develop effective KM solutions, it 
is necessary to analyze the intentional dimension of the organizational setting, i.e. the 
interests, intents, and strategic relationships among the actors of the organization. 
Like AORML, they take an agent-oriented approach to model the domain. The major 
difference when compared to AORML is the adopted i* framework. Instead of the 
AORML constructs of agents, objects, relationships, messages, commitments, etc., 
this framework models the organization as a set of actors, goals, ‘soft goals’, 
dependencies, tasks and resources. 



7 A Few Directions for Future Work 

In the process of defining H&L’s architecture, guided by the use of AORML, we have 
elicited many questions, whose answers will be important for the future development 
of the system. For instance, we intend to address issues related to: a) the structuring of 
the questions and respective answers present in the Explanation Case (EC); b) the 
organization of the personal knowledge assets owned by each peer; and c) the 
management of HelpItems by the Resource Managers (RMs). Targeting a), we aim at 
investigating, for instance, how the techniques applied in a previous work [12] can be 
enhanced (and, perhaps, new techniques applied) in order to provide suitable 
structuring and retrieval of the EC’s questions and answers (refer to the IPD of Fig. 7). 
This investigation, along with some extra studies, can indicate possibilities for 
addressing b) and c) as well. Inspired by current research on the Semantic Web [1], 
we intend to incorporate ontologies into the H&L architecture. A preliminary study 
suggests that these ontologies can be aimed at making knowledge explicit, supporting 
interaction among the system peers. Another possibility is applying contexts to 
organize the peer’s HelpItems, as suggested in [2], where context is defined as an 
explicit semantic schema over a body of local knowledge. 

Another important research focus related to Help&Learn is Personalization, i.e. 
issues regarding how a PA should balance reactiveness, acting on user request, and 
pro-activeness, delivering content to the user. In this context, two questions 
appear to be important: a) how much should be delegated to the PA during the search 
for help? The precision of the peer’s help request should be balanced with the PA’s 
responsibility to search for the appropriate HelpItem; and b) how should the PAs 
balance what they know about the peers and what they disclose to the DS and the 
Broker? As mentioned in H&L’s modeling section, each of these agents (PA, DS and 
Broker) has a different representation of the peer, as they have different goals in the 
system. Thus, the disclosure of information should be appropriate both for the 
achievement of the agent’s goals and for guaranteeing the right level of user 
privacy. 


8 Conclusions 

In this paper, we have described our work in progress with Help&Learn, a peer-to-peer 
agent-oriented architecture, aimed at providing its users with a rich environment 
for both collaborative and individual use of knowledge. In order to do so, results on 
collaborative learning [8,12] and KM related research [5,6,7] have been used in the 
conceptualization and modeling of H&L. We take an agent-oriented perspective on 
system architecture, where agents play a crucial role in supporting the effectiveness, 
flexibility and personalization of the whole process. We then applied an agent-oriented 
modeling approach (AORML), which proved to be an effective modeling 
language for our purposes. On the one hand, AOR models have guided us throughout 
this specification of H&L, aiding us in the system’s requirements specification, 
analysis and initial design cycles. On the other hand, this experimentation has also 
provided us with feedback on how AORML can be extended, adding new constructs 
to facilitate agent-oriented modeling. Finally, this work has led us to the elicitation of 
relevant research focuses and questions (presented in section 7), which form our 
research agenda for the future. 

Abstract. Distributed knowledge management systems (DKMS) have been suggested 
to meet the requirements of today’s knowledge management. Peer-to-peer 
systems offer technical foundations for such distributed systems. To estimate 
the value of P2P-based knowledge management, evaluation criteria that measure 
the performance of such DKMS are required. We suggest a concise framework 
for evaluation of such systems within different usage scenarios. Our approach 
is based on standard measures from the information retrieval and the databases 
community. These measures serve as input to a general evaluation function which 
is used to measure the efficiency of P2P-based KM systems. We describe test scenarios 
as well as the simulation software and data sets that can be used for that 
purpose. 



1 Introduction 

Many enterprises have spent large amounts of money to implement centralized knowledge 
management systems to stay in business in today’s knowledge-based economy, 
often with little success. Among others, [1] suggests a distributed approach to knowledge 
management which better fits organizations and their employees. 

Participants can maintain individual views of the world, while easily sharing knowledge 
in ways such that administration efforts are low. The distributed environment is 
implemented by a peer-to-peer network (which is basically equivalent to a system of 
distributed agents) without any centralized servers. P2P systems have been used for 
collaborative working or file sharing, but knowledge sharing applications herein mostly 
relied on keyword search and very basic structures. Modern (centralized) knowledge 
management systems are based on ontologies, which have been shown to be the right 
answer for problems in knowledge modelling and representation [2]. An ontology [3] is 
a shared specification of a conceptualization. Through their structure, ontologies allow 
answering a wider range of queries than standard representations do. Semantic Web 
technologies can augment this [4]. Current research projects 1 attempt to exploit the 
best of the two worlds. Specifically, we want to do semantic information retrieval in a 
peer-to-peer environment, resulting in a Distributed Knowledge Management System 
(DKMS). 

1 SWAP (swap.semanticweb.org) and Edutella (edutella.jxta.org) 



L. van Elst, V. Dignum, and A. Abecker (Eds.): AMKM 2003, LNAI 2926, pp. 73-88, 2003. 
(c) Springer-Verlag Berlin Heidelberg 2003 

In this work, we suggest a framework for evaluation of such distributed knowledge-based 
systems. Only through a thorough evaluation can we gain the insights to further 
develop and enhance ideas and systems. Evaluation is possible either through user-based 
evaluation or through system evaluation. User-based evaluation measures the user’s 
satisfaction with a system, whereas system evaluation compares different systems with 
respect to a given measure. 

While system evaluation permits a more objective comparison of different approaches, 
the correlation of the results with the final user satisfaction is not always 
clear. However, user-based evaluation is expensive and time-consuming, and it is difficult 
to eliminate the noise which is due to user experience, user interface and other 
human-specific factors. 

Tools developed within the cited projects focus on the technical aspects of knowledge 
management. Thus, we use the system evaluation approach. The need for a standard 
evaluation mechanism is also recognized in other papers in this book, e.g. [5]. 
Techniques from traditional Information Retrieval [6] and networking research [7] will 
have to be combined with ontology-specific measures to gain meaningful results. 

This paper is structured as follows: in the first section we introduce a set of 
use cases to illustrate the different dimensions of the problem at hand. Evaluation 
measures are defined in the second section. The following section outlines 
tools which can be used. The generation of test data for 
these simulations is covered in the succeeding section. Further, we give a practical view 
describing the test parameters. Related and future work conclude this paper. 



2 Scenarios 

The field of possible applications for peer-to-peer computing is huge. Currently running 
systems include file sharing (e.g. Gnutella 2 [8]), collaboration (e.g. Groove 3 ), computing 
(e.g. Seti@home 4 ) and knowledge management [9], to name but a few. For this reason, 
we describe some scenarios for the DKMS we examine. Various conclusions for our 
ontology-based KMS will be drawn from these scenarios. 

2.1 Application Scenarios 

This part will describe some real life situations in order to find characteristics which 
influence the distribution of information within the examined scenarios (Figure 1). The 
purpose of the scenarios is not to give a detailed impression of the entire IT structure 
within the scenario, but rather to emphasize the challenging points for an ontology-based 
KM system realized by a peer-to-peer network. 

Corporation With their organization in many different units, entire corporations pose 
the most complex situation, with respect to the number of domains, conceptualizations and 
documents, that we want to consider here. Typically these units are distributed according 
to organizational tasks, like accounting or human resources, or are more product-related, 
such as development and marketing. The product-related units, for example, work on 
one product (topic) with diverse perspectives or on varying products with similar views, 
viz. use the same vocabulary. 

2 http://www.gnutella.com 

3 http://www.groove.net 

4 http://setiathome.ssl.berkeley.edu/ 

Fig. 1. Scenarios Overview 

We assume each peer 5 has its own ontology, with the addition that employees 
working in similar business units use ontologies which have some concepts in common, 
while ontologies in unrelated units describing e.g. the same product are not easily 
comparable, viz. use a different hierarchy and vocabulary. 

Our evaluation has to show which techniques make best use of existing ontologies 
in order to answer queries according to the user needs. These demands will be examined 
precisely in the future. Queries must quickly reach the peers which can answer 
them, without flooding the network. The answers should be relevant with respect to the 
query. Further, the demand for computer resources like storage and processor time has 
to be monitored. 

Working group A special case within a big company is the single department. In this 
case the domain is predefined and terms with the same meaning occur more often in 
each ontology. However, the demand in terms of retrieval accuracy increases. A major 
research question here is, how to capitalize on ontologies from other peers, viz. Self- 
organization is often cited as one of the advantages of peer-to-peer systems. If every 
peer partly conceptualizes information the combination will result in a more detailed 
description for everybody, because each peer can add concepts from other peers to its 
own structure. 

5 A peer can be the computer system of one user or a general database. 

Very structured department A department of the kind using a very structured process. 
In this case it is possible to define and implement a single ontology which any employee 
has to follow. 



2.2 Summary 

To summarize the single cases from an ontological point of view, we distinguish two 
dimensions: the number of domains which are conceptualized and the number of 
conceptualizations used for one domain. From their combination, four possibilities evolve. 
This observation is in line with the suggestions in [10]. 

n:m ontologies. Each peer uses its own ontology. These ontologies conceptualize 
different domains. 

n:1 ontologies. Each peer uses its own ontology, but all peers conceptualize the same 
domain. 

1:m ontologies. There is one general ontology, but it conceptualizes many domains. The 
peers use only parts of the entire ontology, but these parts can be merged from a 
top-level perspective. In this case two different possibilities evolve: 

disjoint. The peers commit to a particular part of the ontology. Hence two peers 
use either the same or a different ontology. 
overlapping. Each peer has parts of the ontology without respect to the ontologies 
others are using. 

1:1 ontology. Each peer uses the same ontology in one domain. 
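
The four cases can be read as a simple two-way classification. The following is a minimal sketch, assuming the cases are labelled as "ontologies : domains" (n:m, n:1, 1:m, 1:1); the function name and its arguments are hypothetical and only illustrate the two dimensions named above.

```python
def classify_setting(num_ontologies: int, num_domains: int) -> str:
    """Label a setting by the two dimensions of the summary.

    The first symbol stands for the number of conceptualizations
    (ontologies) in use, the second for the number of domains.
    """
    if num_ontologies < 1 or num_domains < 1:
        raise ValueError("both counts must be at least 1")
    left = "1" if num_ontologies == 1 else "n"
    right = "1" if num_domains == 1 else "m"
    return f"{left}:{right}"


# Every peer has its own ontology, all about the same domain.
print(classify_setting(num_ontologies=25, num_domains=1))
```

The disjoint and overlapping variants of the one-ontology, many-domains case are not distinguished here, since they depend on which parts each peer commits to rather than on the two counts.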

From a technical point of view we consider networks ranging from a small number of 
peers to huge corporate networks. This means that different routing strategies have to be 
analyzed. 

The evaluation criteria are straightforward. In all cases the relevance of the answer 
should be high and the response time low, while using few resources of the peers. A 
further aspect is the network behavior if single peers fail or return wrong answers. 

Ontologies provide means to define contexts. The effects on these criteria through 
incorporation of meaning will be evaluated. 

The case studies have demonstrated the kind of peer-to-peer system we focus on. 
To evaluate our techniques we use well-established measures from the Information 
Retrieval and Peer-to-Peer communities, but we also have to introduce new ones which take 
the use of ontologies into account. These measures are described in the following 
section. 

3 Evaluation Functions and Their Parameters 

This section presents a theoretical model of evaluation. In a general overview we define 
the evaluation function followed by its premises. Additionally we present ideas of which 
input and output parameters can be of interest in a DKMS. 

3.1 Evaluation as a Function 

One can imagine our DKMS as a black box doing information retrieval in a Semantic 
Web environment. This black box is supposed to have a certain behavior from which 
evaluation figures result giving us insight into the DKMS. To test and measure this 
behavior we can adjust different input parameters and collect the output figures. 

This can be modelled as a function. The function ($F$) describes the setting and the 
basic algorithms used, that is, the interior of our system. Different parameters are used 
as input ($i_n$), e.g. the number of peers. Specific output figures ($o_m$) result from it, e.g. 
relevance or performance measures. Input and output in this context are not queries and 
answers of the peer network; rather, they are parameters of the DKMS and its 
methodologies. 

$$(i_1, i_2, \ldots, i_n) \xrightarrow{F} (o_1, o_2, \ldots, o_m)$$

Having determined the correlation between input and output, one can adjust the 
parameters until an optimal solution is found. 

This approach is designed along an implementation line with the function repre- 
senting the hard-coded program and the parameters being variables of it. 
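
The idea of a fixed function with adjustable parameters can be sketched as a parameter sweep. This is only an illustration under stated assumptions: `toy_system` is a hypothetical stand-in for the DKMS (its formulas are invented for the example), and `sweep`, the parameter names, and the scoring rule are not part of the original model.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Inputs = Dict[str, float]
Outputs = Dict[str, float]


def sweep(F: Callable[[Inputs], Outputs],
          grid: Dict[str, List[float]]) -> List[Tuple[Inputs, Outputs]]:
    """Evaluate F on every combination of the input parameter values."""
    names = list(grid)
    results = []
    for values in product(*(grid[name] for name in names)):
        inputs = dict(zip(names, values))
        results.append((inputs, F(inputs)))
    return results


def toy_system(inputs: Inputs) -> Outputs:
    """Hypothetical black box: NOT a model of any real DKMS."""
    peers = inputs["num_peers"]
    index = inputs["index_size"]
    return {
        "recall": min(1.0, index / peers),
        "messages_per_query": peers / (1.0 + index),
    }


grid = {"num_peers": [10, 100], "index_size": [1, 10]}
results = sweep(toy_system, grid)

# Pick the setting with the best (assumed) trade-off between the outputs.
best_inputs, best_outputs = max(
    results,
    key=lambda r: r[1]["recall"] - 0.01 * r[1]["messages_per_query"],
)
print(best_inputs, best_outputs)
```

Changing the topology or the selection function would mean replacing `toy_system` itself, whereas changing the number of peers only changes an entry in `grid`.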

3.2 Function Modelling 

The function depends on the algorithms and other properties which will be described 
further. 

Topology. The topology is crucial for the network load imposed by each query. Do 
we want to evaluate random graphs, the star topology, or the HyperCuP environment 
[11]? Further, the content of each peer (and its semantic context) could be used for 
building a network structure. 

Document distribution. The distribution of the documents in a real peer-to-peer 
system is hardly random. The influence of different document distributions on the out- 
put figures will be evaluated. 

Query language. The query language defines the expressiveness of queries. It can 
be interesting to compare performance results of the peer-to-peer system between query 
languages which only allow conjunctions or disjunctions and query languages which 
allow complex recursive queries. 

Selection function. Having a peer structure and a formulated query, the next step is 
to find good ways of matching them. How to select and route to the best peers is a core 
component [12]. 

For the reader it might be confusing why the mentioned points belong to the function 
rather than to the input parameters. In a way the function premises are also input 
parameters. The difference lies in the fact that they are explicitly modelled in the algorithm and 
cannot be changed easily. Input parameters on the other hand are more flexible and can 
be adjusted by changing the value of a variable of the algorithm, while the algorithm itself 
stays the same. The next section will show this. 

3.3 Input Parameters 

A list of possible input parameters that can be entered into the system follows: 

Number of peers. The size of the peer-to-peer network affects the results of the 
system. The scalability of the system is represented by this number. 

Number of documents or statements. Another type of scalability is checked with 
this parameter. Whereas peers are physical locations, this parameter describes content 
objects. They represent the smallest entities in the system. 

Structure. Most of the decisions around topology directly influence the function. 
But depending on the chosen topology different parameters can be used for further 
adjustment. When using indexes an important figure is the index size: how much content 
will eventually be stored in the network, and how detailed is the knowledge about other 
peers? Slightly different is the level of connectivity or the size of the routing table. These 
are figures representing the characteristics of the network. 

3.4 Output Figures 

The output figures of evaluation functions ensure comparability to other systems. As the 
area of semantic peer-to-peer systems is rather new, there are no established standard 
evaluation functions, which makes it difficult to fulfill this requirement. 
The following list provides well-known evaluation functions from related research 
fields. 

Relevance. Relevance is the subjective notion of a user deciding whether the 
information is of importance with respect to a query. Approximations can be done using, 
e.g., keywords. For comparison purposes one could imagine having a rating between 
0 and 1 for each answer. 

Recall R. Recall is a standard measure in information retrieval. It describes the 
proportion of all relevant documents included in the retrieved set. 

$$R = \frac{|\text{relevant} \cap \text{retrieved}|}{|\text{relevant}|}$$

Precision P. Precision is also a standard measure in information retrieval. It 
describes the proportion of a retrieved set that is relevant. 

$$P = \frac{|\text{relevant} \cap \text{retrieved}|}{|\text{retrieved}|}$$

F-measure F. Several combinations of the two first-mentioned measures have been 
developed. The most common one is the F-measure [13] describing the normalized 
symmetric difference between retrieved and relevant documents. 

$$F = \frac{(\beta^2 + 1) P R}{\beta^2 P + R} \quad \text{with } \beta = P/R$$
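
As a worked example, the three measures can be computed directly from sets of document identifiers. This is a minimal sketch: the function names and the document identifiers are hypothetical, and the F-measure is written in the common weighted form $(\beta^2 + 1) P R / (\beta^2 P + R)$ with $\beta$ left as a free parameter (defaulting to 1, the harmonic mean), which is an assumption about the intended formula.

```python
from typing import Set, Tuple


def precision_recall(relevant: Set[str], retrieved: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) for one query."""
    hits = len(relevant & retrieved)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted F-measure; beta = 1 gives the harmonic mean of P and R."""
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0
    return (beta ** 2 + 1) * precision * recall / denominator


relevant = {"d1", "d2", "d3", "d4"}
retrieved = {"d2", "d3", "d5"}

p, r = precision_recall(relevant, retrieved)
print(p, r, f_measure(p, r))
```

Here two of the three retrieved documents are relevant (P = 2/3) and two of the four relevant documents were found (R = 1/2), so the balanced F-measure is 4/7.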

Information loss. A measure to evaluate the loss of information which occurs when 
a query must be generalized on the answering peer. This might happen if the queried 
ontology does not contain a specific concept, but one which is more general and included 
in the ontology of the requesting peer [14]. 

Reliability. This can be split into two sub-parameters. Fault tolerance describes 
what degree of failures and problems is still tolerated before the system finally breaks 
down. Failures in a DKMS can be a peer leaving the network or unacknowledged 
messages. The failure rate specifies the percentage of actual breakdowns of the whole 
system. 

Real time. This measures the time from sending off the query to getting a result. As 
this figure is critical for end users, we take it into consideration as well. It was used in 
[15]. 

Network load. This technical figure can be measured with different sub-parameters. 
This is especially important for internal technical measurements [16]. Messages per 
query traces to what extent the network is being flooded by one query. The average 
number of hops can indicate how goal-oriented the routing of a query is and how fast an 
answer may be returned. 
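
Both sub-parameters can be counted in a small experiment. The sketch below assumes plain flooding with a hop limit, which is only one possible routing strategy and is not prescribed by the text; the network, the peer names, and the function `flood` are hypothetical. As a simplification, each peer forwards the query to all of its neighbours, including the one it received the query from.

```python
from collections import deque
from typing import Dict, List, Set, Tuple


def flood(adjacency: Dict[str, List[str]], start: str,
          holders: Set[str], ttl: int) -> Tuple[int, List[int]]:
    """Flood a query from `start` for at most `ttl` hops.

    Returns the number of messages sent and the hop count at which
    each answering peer (a member of `holders`) was first reached.
    """
    seen = {start}
    queue = deque([(start, 0)])
    messages = 0
    hops_to_answers = []
    while queue:
        peer, hops = queue.popleft()
        if peer in holders:
            hops_to_answers.append(hops)
        if hops == ttl:
            continue  # hop limit reached, do not forward further
        for neighbour in adjacency[peer]:
            messages += 1  # every forwarded copy counts, even duplicates
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return messages, hops_to_answers


adjacency = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
messages, hops = flood(adjacency, start="a", holders={"d", "e"}, ttl=3)
print(messages, sum(hops) / len(hops))
```

A routing strategy that selects peers by their content would aim to reach the same answering peers with fewer messages than this baseline.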

Time to satisfaction. It is a combination of relevance and real time, with 
relevance having to exceed a certain value [17]. Again this is a very subjective figure. 

3.5 Output Combination 

We have set up a theoretical model for evaluation. The strength of semantic 
peer-to-peer lies not in its single areas but in their combination. 
Just as the input parameters come from the different areas of peer-to-peer, Semantic 
Web, and information retrieval, it is also necessary to unite the output figures to achieve 
meaningful results. One possibility would be to arrange linear combinations. Normalized 
vectors represent another. The combination of different output figures will finally allow 
us to decide upon the quality of the new system. 
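
A linear combination of output figures might look as follows. This is a sketch under stated assumptions: the figures, their value ranges, and the weights are invented for illustration, and each figure is first normalized to the interval [0, 1] so that measures with different units (and with "lower is better" semantics) become comparable.

```python
from typing import Dict, Tuple


def normalise(value: float, worst: float, best: float) -> float:
    """Map a figure to [0, 1], where 1 is best.

    Works for 'lower is better' figures too: pass worst > best.
    """
    if worst == best:
        raise ValueError("worst and best must differ")
    score = (value - worst) / (best - worst)
    return max(0.0, min(1.0, score))


def combined_score(figures: Dict[str, float],
                   ranges: Dict[str, Tuple[float, float]],
                   weights: Dict[str, float]) -> float:
    """Weighted average of the normalized output figures."""
    total_weight = sum(weights.values())
    weighted = sum(
        weights[name] * normalise(figures[name], *ranges[name])
        for name in weights
    )
    return weighted / total_weight


# Hypothetical output figures of one simulation run.
figures = {"f_measure": 0.6, "messages_per_query": 40.0, "response_time_s": 2.0}
# (worst, best) per figure; these bounds are assumptions.
ranges = {
    "f_measure": (0.0, 1.0),
    "messages_per_query": (100.0, 0.0),
    "response_time_s": (10.0, 0.0),
}
weights = {"f_measure": 0.5, "messages_per_query": 0.25, "response_time_s": 0.25}

score = combined_score(figures, ranges, weights)
print(score)
```

The choice of weights encodes how much retrieval quality is valued against network load and response time, and would have to be justified for each application scenario.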

The output figures will be provided using a simulation package. 



4 P2P Network Simulation 

P2P systems are not set up and maintained by a central authority; thus, creating and 
observing a non-trivial network and measuring the evaluation functions as described in 
the previous section is a hard task. Simulation can help to gain insight into the behavior 
of the system. Many research contributions such as Freenet [18] and Anthill [19] have 
used simulations in order to demonstrate the performance of their systems. Simulation 
is a core component for evaluation. 

4.1 Discrete Event Simulation (DES) 

Discrete Event Simulation observes the behaviour of a model over time [20]. The model 
has a state described by variables of the model that completely define the future of the 
system. The state of the model is usually encapsulated into a set of entities (cf. objects 
in OOP). Discrete Events changing the state of the system occur at discrete points in 
time (as opposed to continuous state changes). Events may trigger new events. 
Statistical Variables define the performance measure the user is interested in. This could be 
something like “average load on the server” or “maximum queue length”. 

Event oriented DES describes the dynamic behavior of the system solely by a se- 
quence of events; the actions triggering the events are not considered. Process oriented 
DES combines the entities containing the state of the system and the actions that cause 
events (cf. OOP). 
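
The event-oriented view can be illustrated with a very small simulator built around a priority queue of timestamped events. This is a minimal sketch, not the interface of any particular simulation package: the class `Simulator`, the handler `receive`, and the fixed link latency of 50 ms are assumptions made for the example.

```python
import heapq
from itertools import count


class Simulator:
    """Event-oriented DES core: a clock and a queue of pending events."""

    def __init__(self):
        self.now = 0           # simulated time in milliseconds
        self._queue = []       # heap of (time, sequence number, action, args)
        self._ids = count()    # tie-breaker for events at the same time

    def schedule(self, delay, action, *args):
        """Register an event that fires `delay` ms after the current time."""
        heapq.heappush(self._queue, (self.now + delay, next(self._ids), action, args))

    def run(self, until=float("inf")):
        """Process events in time order; handlers may schedule new events."""
        while self._queue and self._queue[0][0] <= until:
            self.now, _, action, args = heapq.heappop(self._queue)
            action(self, *args)


arrivals = []  # statistical variable: (time, peer) for each received query


def receive(sim, peer, last_peer):
    """A peer receives the query and forwards it to the next peer."""
    arrivals.append((sim.now, peer))
    if peer < last_peer:
        sim.schedule(50, receive, peer + 1, last_peer)  # assumed 50 ms latency


sim = Simulator()
sim.schedule(0, receive, 0, 3)  # peer 0 issues the query at time 0
sim.run()
print(arrivals)
```

The simulated clock jumps from one event to the next, so the run time of the simulation depends on the number of events rather than on the length of the simulated period.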

Typical Components of Simulation Software Packages. DES software typically includes 
abstractions for entities, connections between entities, and events transmitted on those 
connections (see fig. 2), which corresponds well to the P2P scenario. Process-oriented 
packages also include an abstraction for processes running on entities. Some simulators 
provide a glue language which can be used to compose models easily. 




